Initialization

Loading libraries

## Loading required package: Hmisc
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
## 
##     format.pval, units
## funModeling v.1.9.4 :)
## Examples and tutorials at livebook.datascienceheroes.com
##  / Now in Spanish: librovivodecienciadedatos.ai
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ tibble  3.1.0     ✓ dplyr   1.0.5
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ✓ purrr   0.3.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::between()   masks data.table::between()
## x dplyr::filter()    masks stats::filter()
## x dplyr::first()     masks data.table::first()
## x dplyr::lag()       masks stats::lag()
## x dplyr::last()      masks data.table::last()
## x dplyr::src()       masks Hmisc::src()
## x dplyr::summarize() masks Hmisc::summarize()
## x purrr::transpose() masks data.table::transpose()
## 
## Attaching package: 'bnlearn'
## The following object is masked from 'package:Hmisc':
## 
##     impute

Loading the data

We chose the “Used Cars Dataset from Craigslist.org”, available at: https://www.kaggle.com/austinreese/craigslist-carstrucks-data.

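# Treat empty strings, single blanks, "NA" and "0" as missing values
data <- read.csv(file = 'data/vehicles.csv', header = TRUE, na.strings = c("", " ", "NA", "0"))
# Drop the first column of the file
data <- subset(data, select = -1)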
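The effect of the na.strings argument can be checked on a tiny inline table. This is a minimal sketch: the CSV text and the name `demo` are made up for illustration and are not part of the dataset.

```r
# Hypothetical three-row CSV, used only to illustrate na.strings
csv_text <- "id,price,color\n1,0,red\n2,1500,\n3,NA,blue"

# Same na.strings as in the loading step: a field that is exactly
# "", " ", "NA" or "0" is read as a missing value
demo <- read.csv(text = csv_text, header = TRUE,
                 na.strings = c("", " ", "NA", "0"))

print(demo)

# Number of missing values per column
na_counts <- colSums(is.na(demo))
print(na_counts)
```

Note that a price of 0 becomes NA here, while 1500 is untouched, because the comparison is made against the whole field and not against individual characters.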

Exploratory analysis

In this part we go through the dataset and describe it. We take a closer look at which values the attributes take and what they describe.

c('number of columns:', ncol(data))
## [1] "number of columns:" "25"
c('number of rows',nrow(data))
## [1] "number of rows" "458213"

We have just under 460 thousand records available.

Attribute description

names(data)
##  [1] "id"           "url"          "region"       "region_url"   "price"       
##  [6] "year"         "manufacturer" "model"        "condition"    "cylinders"   
## [11] "fuel"         "odometer"     "title_status" "transmission" "VIN"         
## [16] "drive"        "size"         "type"         "paint_color"  "image_url"   
## [21] "description"  "state"        "lat"          "long"         "posting_date"

There are 25 attributes in total.

  • id - unique identifier of the record - 10-digit number
  • url - full URL from which the vehicle listing was downloaded - url/string
  • region - region the car is sold from - string
  • region_url - URL of the region's category on the listing portal - url/string
  • price - purchase/sale price of the vehicle - numeric
  • year - year of manufacture of the vehicle - numeric
  • manufacturer - manufacturer of the vehicle - string
  • model - model of the vehicle - string
  • condition - condition the vehicle is in
  • cylinders - number of cylinders of the vehicle's engine
  • fuel - fuel type - string
  • odometer - number of miles driven - numeric
  • title_status - status of the vehicle's title (for example clean, lien)
  • transmission - transmission type of the vehicle - string
  • VIN - (supplementary) vehicle identification number
  • drive - drive type of the vehicle (for example rwd, fwd, 4wd) - string
  • size - size category of the vehicle, for example full-size - string
  • type - body style of the vehicle - sedan, coupe, SUV - string
  • paint_color - color of the vehicle - string
  • image_url - URL of the vehicle's image - url/string
  • description - description of the vehicle
  • state - state in which the vehicle is sold
  • lat and long - coordinates
  • posting_date - date the listing was posted
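The types listed above can be cross-checked against what R actually stores. The following is a small sketch on a hand-made two-row data frame; `sample_rows` and its values are illustrative stand-ins copied from the sample output, not a replacement for the real `data` object.

```r
# Hypothetical two-row stand-in for the loaded dataset
sample_rows <- data.frame(
  id = c(7240372487, 7240309422),
  price = c(35990L, 7500L),
  manufacturer = c("chevrolet", "hyundai"),
  posting_date = c("2020-12-02T08:11:30-0600", "2020-12-02T02:11:50-0600"),
  stringsAsFactors = FALSE
)

# Storage class of every column, as a named character vector
column_types <- sapply(sample_rows, class)
print(column_types)
```

On the real dataset the same call would be `sapply(data, class)`; `str(data)` gives a similar overview together with the first few values of each column.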

Below is a small sample of the data, which shows the values in more detail.

head(data,5)
##           id
## 1 7240372487
## 2 7240309422
## 3 7240224296
## 4 7240103965
## 5 7239983776
##                                                                                          url
## 1 https://auburn.craigslist.org/ctd/d/auburn-university-2010-chevy-chevrolet/7240372487.html
## 2         https://auburn.craigslist.org/cto/d/auburn-2014-hyundai-sonata-20t/7240309422.html
## 3                     https://auburn.craigslist.org/cto/d/auburn-2006-bmw-x3/7240224296.html
## 4                           https://auburn.craigslist.org/cto/d/lanett-truck/7240103965.html
## 5           https://auburn.craigslist.org/cto/d/auburn-2005-ford-f350-lariat/7239983776.html
##   region                    region_url price year manufacturer
## 1 auburn https://auburn.craigslist.org 35990 2010    chevrolet
## 2 auburn https://auburn.craigslist.org  7500 2014      hyundai
## 3 auburn https://auburn.craigslist.org  4900 2006          bmw
## 4 auburn https://auburn.craigslist.org  2000 1974    chevrolet
## 5 auburn https://auburn.craigslist.org 19500 2005         ford
##                  model condition   cylinders   fuel odometer title_status
## 1 corvette grand sport      good 8 cylinders    gas    32742        clean
## 2               sonata excellent 4 cylinders    gas    93600        clean
## 3              x3 3.0i      good 6 cylinders    gas    87046        clean
## 4                 c-10      good 4 cylinders    gas   190000        clean
## 5          f350 lariat excellent 8 cylinders diesel   116000         lien
##   transmission               VIN drive      size   type paint_color
## 1        other 1G1YU3DW1A5106980   rwd      <NA>  other        <NA>
## 2    automatic 5NPEC4AB0EH813529   fwd      <NA>  sedan        <NA>
## 3    automatic              <NA>  <NA>      <NA>    SUV        blue
## 4    automatic              <NA>   rwd full-size pickup        blue
## 5    automatic              <NA>   4wd full-size pickup        blue
##                                                            image_url
## 1 https://images.craigslist.org/00N0N_ipkbHVZYf4w_0gw0co_600x450.jpg
## 2 https://images.craigslist.org/00s0s_gBHYmJ5o7yM_0ne0hq_600x450.jpg
## 3 https://images.craigslist.org/00B0B_5zgEGWPOrt0_07L0ak_600x450.jpg
## 4 https://images.craigslist.org/00M0M_6o7KcDpArwl_0CI0t2_600x450.jpg
## 5 https://images.craigslist.org/00p0p_b95l1EgUfly_0CI0t2_600x450.jpg
##   description
## 1 Carvana is the safer way to buy a car During these uncertain times, Carvana is dedicated to ensuring safety for all of our customers. In addition to our 100% online shopping and selling experience that allows all customers to buy and trade their cars without ever leaving the safety of their house, we’re providing touchless delivery that make all aspects of our process even safer. Now, you can get the car you want, and trade in your old one, while avoiding person-to-person contact with our friendly advocates. There are some things that can’t be put off. And if buying a car is one of them, know that we’re doing everything we can to keep you keep moving while continuing to put your health safety, and happiness first. Vehicle Stock# 2000721559📱 Want to instantly check this car’s availability? Call us at  334-758-9176Just text that stock number to 855-976-4304 or head to http://www.carvanaauto.com/6143424-74502 and plug it into the search bar!Get PRE-QUALIFIED for your auto loan in 2 minutes - no hit to your credit:http://finance.carvanaauto.com/6143424-74502Looking for more cars like this one? We have 94 Chevrolet Corvette in stock for as low as $27990!Why buy with Carvana? We have one standard: the highest. Take a look at just some of the qualifications all of our cars must meet before we list them.150-POINT INSPECTION: We put each vehicle through a 150-point inspection so that you can be 100% confident in its quality and safety. See everything that goes into our inspections at:http://www.carvanaauto.com/6143424-74502NO REPORTED ACCIDENTS: We do not sell cars that have been in a reported accident or have a frame or structural damage.7 DAY TEST OWN MONEY BACK GUARANTEE: Every Carvana car comes with a 7-day money-back guarantee. Why? It takes more than 15-minutes to make a decision on your next car. Learn more about test owning at http://about.carvanaauto.comFLEXIBLE FINANCING, TRADE INS WELCOME: We’re all about real-time financing without the middle man. 
Need financing? Pick a combination of down and monthly payments that work for you. Have a trade-in? We’ll give you a value in 2 minutes. Check out everything about our financing at:http://finance.carvanaauto.com/6143424-74502COST SAVINGS: Carvana's business model has fewer expenses and no bloated fees compared to your local dealership. See how much we can save you at http://about.carvanaauto.comPREMIUM DETAIL: We go the extra mile so that your car is looking as good as new. There are a lot of specifics that we won’t list here (we wash, clean, buff, paint, polish, wax, seal), but trust us that when your car arrives, it’s going to look sweet.Vehicle Info for Stock# 2000721559Trim: Grand Sport Convertible 2D ConvertibleMileage: 32k milesExterior Color: GrayInterior Color: BlackEngine: 6.2L V8 430hp 424ft. lbs.Drive: rwdTransmission: Automatic, 6-Spd w/Paddle ShiftVIN: 1G1YU3DW1A5106980Dealer Disclosure: Price excludes tax, title, and registration (which we handle for you).Disclaimer: You agree that by providing your phone number, Carvana, or Carvana’s authorized representatives*, may call and/or send text messages (including by using equipment to automatically dial telephone numbers) about your interest in a purchase, for marketing/sales purposes, or for any other servicing or informational purpose related to your account. You do not have to consent to receiving calls or texts to purchase from Carvana. While every reasonable effort is made to ensure the accuracy of the information for this Chevrolet Corvette, we are not responsible for any errors or omissions contained in this ad. 
Please verify any information in question with Carvana at 334-758-9176*Including, but not limited to, Bridgecrest Credit Company, GO Financial and SilverRock Automotive.*Chevrolet* *Corvette* *Chevy* *Chevrolet* *Corvette* *vZR1* *Chevrolet* *Corvette* *Z06* *Hardtop* *Chevrolet* *Corvette* *Stingray* *Chevrolet* *Corvette* *3* *Lt* *Chevrolet* *Corvette* *C5-R* *Chevrolet* *Corvette* *Grand* *Sport* *Chevrolet* *Corvette* *Corvette* *C6* *ZR1* *Chevrolet* *Corvette* *2LT* *Chevrolet* *Corvette* *4LT* *Sports* *Car* *Coupe* 2021  2020  2019  2018  2017  2016  2015  2014  2013  2012  2011  2010  2009  2008  2007  2006  2005  2004  2003  2002  2001  2000   21    19  18  17  16  15  14  13  12  11  10  09  08  07  06  05  04  03  02  01  00
## 2 I'll move to another city and try to sell my car. The car is in very good condition,  everything works and fully cleaned. It equipped with a heated seat, power seat, backup camera, Bluetooth, keyless entry and start. If you are interested in my car, please email me.
## 3 Clean 2006 BMW X3 3.0I.  Beautiful and rare Blue Water Metallic exterior and tan interior color combination. 
5-speed automatic transmission, AWD, CD/AM/FM radio, cold A/C (just serviced along w/oil change), alloy wheels, split rear-seats, driver/passenger air bags, multi-function remote w/ keyless entry, electric windows/door locks, cruise control, lighted vanity mirrors and many other extras.  Missing tow eye cover in rear (~$20 to replace), tires ~50% tread remaining and a few blemishes on the exterior (scuffs, scratches, normal wear & tear).  Would make an excellent, safe run-around town or college vehicle.  Title in hand, priced to sell!
## 4 1974 chev. truck (LONG BED) NEW starter front and back breaks
## 5 2005 Ford F350 Lariat (Bullet Proofed). This truck was bullet proofed early on and has been well maintained. Truck is equiped with a 6.0 liter turbo diesel. Currently has 116K miles. Everything on the truck works as it should, truck is in excellent condition. Truck is in all original condition (except for the Bullet Proof upgrades). Truck comes equipped with gooseneck hitch, 15,000 lbs bumper hitch, brake controller, and upfitter switches. 
Has 430 limited slip gears. Fully loaded interior with heated leather power seats and power sliding back glass. It is an excellent choice for hauling a 5th wheel camper or for anyone needing to haul heavy loads. If you are looking for a pre-emission controlled diesel, you will not find a better truck than this one. Price is firm, Call Mark at  show contact info
##   state      lat      long             posting_date
## 1    al 32.59000 -85.48000 2020-12-02T08:11:30-0600
## 2    al 32.54750 -85.46820 2020-12-02T02:11:50-0600
## 3    al 32.61681 -85.46415 2020-12-01T19:50:41-0600
## 4    al 32.86160 -85.21610 2020-12-01T15:54:45-0600
## 5    al 32.54750 -85.46820 2020-12-01T12:53:56-0600

Missing values

Let us count the records in the dataset that have at least one missing attribute value.

na_rows <- data[rowSums(is.na(data)) > 0,]
nrow(na_rows)
## [1] 418512

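That is a large share of the data: 418,512 of the 458,213 records, roughly 91%. The cause is the na.strings=c(""," ","NA","0") argument used when loading the file, which turns empty strings, single spaces, "NA" and "0" into missing values. We can look at a small sample of the rows with missing values: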

na_rows[1:5,]
##           id
## 1 7240372487
## 2 7240309422
## 3 7240224296
## 4 7240103965
## 5 7239983776
##                                                                                          url
## 1 https://auburn.craigslist.org/ctd/d/auburn-university-2010-chevy-chevrolet/7240372487.html
## 2         https://auburn.craigslist.org/cto/d/auburn-2014-hyundai-sonata-20t/7240309422.html
## 3                     https://auburn.craigslist.org/cto/d/auburn-2006-bmw-x3/7240224296.html
## 4                           https://auburn.craigslist.org/cto/d/lanett-truck/7240103965.html
## 5           https://auburn.craigslist.org/cto/d/auburn-2005-ford-f350-lariat/7239983776.html
##   region                    region_url price year manufacturer
## 1 auburn https://auburn.craigslist.org 35990 2010    chevrolet
## 2 auburn https://auburn.craigslist.org  7500 2014      hyundai
## 3 auburn https://auburn.craigslist.org  4900 2006          bmw
## 4 auburn https://auburn.craigslist.org  2000 1974    chevrolet
## 5 auburn https://auburn.craigslist.org 19500 2005         ford
##                  model condition   cylinders   fuel odometer title_status
## 1 corvette grand sport      good 8 cylinders    gas    32742        clean
## 2               sonata excellent 4 cylinders    gas    93600        clean
## 3              x3 3.0i      good 6 cylinders    gas    87046        clean
## 4                 c-10      good 4 cylinders    gas   190000        clean
## 5          f350 lariat excellent 8 cylinders diesel   116000         lien
##   transmission               VIN drive      size   type paint_color
## 1        other 1G1YU3DW1A5106980   rwd      <NA>  other        <NA>
## 2    automatic 5NPEC4AB0EH813529   fwd      <NA>  sedan        <NA>
## 3    automatic              <NA>  <NA>      <NA>    SUV        blue
## 4    automatic              <NA>   rwd full-size pickup        blue
## 5    automatic              <NA>   4wd full-size pickup        blue
##                                                            image_url
## 1 https://images.craigslist.org/00N0N_ipkbHVZYf4w_0gw0co_600x450.jpg
## 2 https://images.craigslist.org/00s0s_gBHYmJ5o7yM_0ne0hq_600x450.jpg
## 3 https://images.craigslist.org/00B0B_5zgEGWPOrt0_07L0ak_600x450.jpg
## 4 https://images.craigslist.org/00M0M_6o7KcDpArwl_0CI0t2_600x450.jpg
## 5 https://images.craigslist.org/00p0p_b95l1EgUfly_0CI0t2_600x450.jpg
##   description
## 1 Carvana is the safer way to buy a car During these uncertain times, Carvana is dedicated to ensuring safety for all of our customers. In addition to our 100% online shopping and selling experience that allows all customers to buy and trade their cars without ever leaving the safety of their house, we’re providing touchless delivery that make all aspects of our process even safer. Now, you can get the car you want, and trade in your old one, while avoiding person-to-person contact with our friendly advocates. There are some things that can’t be put off. And if buying a car is one of them, know that we’re doing everything we can to keep you keep moving while continuing to put your health safety, and happiness first. Vehicle Stock# 2000721559📱 Want to instantly check this car’s availability? Call us at  334-758-9176Just text that stock number to 855-976-4304 or head to http://www.carvanaauto.com/6143424-74502 and plug it into the search bar!Get PRE-QUALIFIED for your auto loan in 2 minutes - no hit to your credit:http://finance.carvanaauto.com/6143424-74502Looking for more cars like this one? We have 94 Chevrolet Corvette in stock for as low as $27990!Why buy with Carvana? We have one standard: the highest. Take a look at just some of the qualifications all of our cars must meet before we list them.150-POINT INSPECTION: We put each vehicle through a 150-point inspection so that you can be 100% confident in its quality and safety. See everything that goes into our inspections at:http://www.carvanaauto.com/6143424-74502NO REPORTED ACCIDENTS: We do not sell cars that have been in a reported accident or have a frame or structural damage.7 DAY TEST OWN MONEY BACK GUARANTEE: Every Carvana car comes with a 7-day money-back guarantee. Why? It takes more than 15-minutes to make a decision on your next car. Learn more about test owning at http://about.carvanaauto.comFLEXIBLE FINANCING, TRADE INS WELCOME: We’re all about real-time financing without the middle man. 
Need financing? Pick a combination of down and monthly payments that work for you. Have a trade-in? We’ll give you a value in 2 minutes. Check out everything about our financing at:http://finance.carvanaauto.com/6143424-74502COST SAVINGS: Carvana's business model has fewer expenses and no bloated fees compared to your local dealership. See how much we can save you at http://about.carvanaauto.comPREMIUM DETAIL: We go the extra mile so that your car is looking as good as new. There are a lot of specifics that we won’t list here (we wash, clean, buff, paint, polish, wax, seal), but trust us that when your car arrives, it’s going to look sweet.Vehicle Info for Stock# 2000721559Trim: Grand Sport Convertible 2D ConvertibleMileage: 32k milesExterior Color: GrayInterior Color: BlackEngine: 6.2L V8 430hp 424ft. lbs.Drive: rwdTransmission: Automatic, 6-Spd w/Paddle ShiftVIN: 1G1YU3DW1A5106980Dealer Disclosure: Price excludes tax, title, and registration (which we handle for you).Disclaimer: You agree that by providing your phone number, Carvana, or Carvana’s authorized representatives*, may call and/or send text messages (including by using equipment to automatically dial telephone numbers) about your interest in a purchase, for marketing/sales purposes, or for any other servicing or informational purpose related to your account. You do not have to consent to receiving calls or texts to purchase from Carvana. While every reasonable effort is made to ensure the accuracy of the information for this Chevrolet Corvette, we are not responsible for any errors or omissions contained in this ad. 
Please verify any information in question with Carvana at 334-758-9176*Including, but not limited to, Bridgecrest Credit Company, GO Financial and SilverRock Automotive.*Chevrolet* *Corvette* *Chevy* *Chevrolet* *Corvette* *vZR1* *Chevrolet* *Corvette* *Z06* *Hardtop* *Chevrolet* *Corvette* *Stingray* *Chevrolet* *Corvette* *3* *Lt* *Chevrolet* *Corvette* *C5-R* *Chevrolet* *Corvette* *Grand* *Sport* *Chevrolet* *Corvette* *Corvette* *C6* *ZR1* *Chevrolet* *Corvette* *2LT* *Chevrolet* *Corvette* *4LT* *Sports* *Car* *Coupe* 2021  2020  2019  2018  2017  2016  2015  2014  2013  2012  2011  2010  2009  2008  2007  2006  2005  2004  2003  2002  2001  2000   21    19  18  17  16  15  14  13  12  11  10  09  08  07  06  05  04  03  02  01  00
## 2 I'll move to another city and try to sell my car. The car is in very good condition,  everything works and fully cleaned. It equipped with a heated seat, power seat, backup camera, Bluetooth, keyless entry and start. If you are interested in my car, please email me.
## 3 Clean 2006 BMW X3 3.0I.  Beautiful and rare Blue Water Metallic exterior and tan interior color combination. 
5-speed automatic transmission, AWD, CD/AM/FM radio, cold A/C (just serviced along w/oil change), alloy wheels, split rear-seats, driver/passenger air bags, multi-function remote w/ keyless entry, electric windows/door locks, cruise control, lighted vanity mirrors and many other extras.  Missing tow eye cover in rear (~$20 to replace), tires ~50% tread remaining and a few blemishes on the exterior (scuffs, scratches, normal wear & tear).  Would make an excellent, safe run-around town or college vehicle.  Title in hand, priced to sell!
## 4 1974 chev. truck (LONG BED) NEW starter front and back breaks
## 5 2005 Ford F350 Lariat (Bullet Proofed). This truck was bullet proofed early on and has been well maintained. Truck is equiped with a 6.0 liter turbo diesel. Currently has 116K miles. Everything on the truck works as it should, truck is in excellent condition. Truck is in all original condition (except for the Bullet Proof upgrades). Truck comes equipped with gooseneck hitch, 15,000 lbs bumper hitch, brake controller, and upfitter switches. 
Has 430 limited slip gears. Fully loaded interior with heated leather power seats and power sliding back glass. It is an excellent choice for hauling a 5th wheel camper or for anyone needing to haul heavy loads. If you are looking for a pre-emission controlled diesel, you will not find a better truck than this one. Price is firm, Call Mark at  show contact info
##   state      lat      long             posting_date
## 1    al 32.59000 -85.48000 2020-12-02T08:11:30-0600
## 2    al 32.54750 -85.46820 2020-12-02T02:11:50-0600
## 3    al 32.61681 -85.46415 2020-12-01T19:50:41-0600
## 4    al 32.86160 -85.21610 2020-12-01T15:54:45-0600
## 5    al 32.54750 -85.46820 2020-12-01T12:53:56-0600
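
To see how much of the missing-value count comes from the na.strings setting, the same small CSV can be parsed twice and the missing values counted per column. This is only a sketch: the inline CSV, its values and the variable names (csv_text, raw, strict) are made up for illustration and stand in for data/vehicles.csv.

```r
# Toy CSV standing in for data/vehicles.csv (values are made up for illustration)
csv_text <- "price,odometer,paint_color
0,32742,
7500,0,blue
4900,87046,NA"

# Default parsing: only the string "NA" is treated as missing
raw <- read.csv(text = csv_text, header = TRUE)

# Same na.strings as in the loading step: "", " ", "NA" and "0" all become NA
strict <- read.csv(text = csv_text, header = TRUE,
                   na.strings = c("", " ", "NA", "0"))

# Number of missing values per column under each setting
na_raw <- colSums(is.na(raw))
na_strict <- colSums(is.na(strict))
print(na_raw)
print(na_strict)

# Rows with at least one missing value; same idea as rowSums(is.na(x)) > 0
n_incomplete <- sum(!complete.cases(strict))
print(n_incomplete)
```

Applied to the real data frame, colSums(is.na(data)) would show which attributes contribute most to the 418512 incomplete rows.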

Descriptive statistics

summary(data)
##        id                 url               region           region_url       
##  Min.   :7208549803   Length:458213      Length:458213      Length:458213     
##  1st Qu.:7231952523   Class :character   Class :character   Class :character  
##  Median :7236408504   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :7235233427                                                           
##  3rd Qu.:7239320847                                                           
##  Max.   :7241019367                                                           
##                                                                               
##      price                 year      manufacturer          model          
##  Min.   :         1   Min.   :1900   Length:458213      Length:458213     
##  1st Qu.:      5995   1st Qu.:2008   Class :character   Class :character  
##  Median :     12394   Median :2013   Mode  :character   Mode  :character  
##  Mean   :     43635   Mean   :2011                                        
##  3rd Qu.:     22900   3rd Qu.:2016                                        
##  Max.   :3615215112   Max.   :2021                                        
##  NA's   :33753        NA's   :1050                                        
##   condition          cylinders             fuel              odometer         
##  Length:458213      Length:458213      Length:458213      Min.   :         0  
##  Class :character   Class :character   Class :character   1st Qu.:     40877  
##  Mode  :character   Mode  :character   Mode  :character   Median :     87641  
##                                                           Mean   :    101670  
##                                                           3rd Qu.:    134000  
##                                                           Max.   :2043755555  
##                                                           NA's   :55303       
##  title_status       transmission           VIN               drive          
##  Length:458213      Length:458213      Length:458213      Length:458213     
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##      size               type           paint_color         image_url        
##  Length:458213      Length:458213      Length:458213      Length:458213     
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  description           state                lat              long        
##  Length:458213      Length:458213      Min.   :-82.61   Min.   :-164.09  
##  Class :character   Class :character   1st Qu.: 34.60   1st Qu.:-110.89  
##  Mode  :character   Mode  :character   Median : 39.24   Median : -88.31  
##                                        Mean   : 38.53   Mean   : -94.38  
##                                        3rd Qu.: 42.48   3rd Qu.: -81.02  
##                                        Max.   : 82.05   Max.   : 150.90  
##                                        NA's   :7448     NA's   :7448     
##  posting_date      
##  Length:458213     
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 

We selected the following attributes as significant for further work:

  • price
  • year
  • manufacturer
  • model
  • condition
  • cylinders
  • fuel
  • odometer
  • transmission
  • VIN
  • drive
  • size
  • type

Attribute analysis

Price (price)

We check whether the attribute contains any missing values:

sum(is.na(data$price))
## [1] 33753
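
The same check can be run over every column at once. A minimal sketch on a small made-up data frame (the `toy` values are illustrative and not taken from the dataset):

```r
# Toy data frame standing in for `data` (values are made up for illustration)
toy <- data.frame(
  price = c(5995, NA, 12394, NA),
  year  = c(2008, 2013, NA, 2016),
  fuel  = c("gas", "diesel", NA, "gas")
)

# Number of missing values per column
na_count <- colSums(is.na(toy))

# Share of missing values per column, in percent
na_share <- round(100 * colMeans(is.na(toy)), 1)

print(data.frame(na_count, na_share))
```

Applied to `data` instead of `toy`, the `price` entry of `na_count` is the same number as the `sum(is.na(data$price))` result above.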

Next we look at outliers in the attribute:

boxplot(data$price, las=3)

We can see that the price attribute contains a few very high values. These are either luxury cars or “dirty” values.

data[order(-data$price),][1:5,c('manufacturer','model','year','price')]
##        manufacturer                 model year      price
## 385435    chevrolet      silverado 2500hd 2006 3615215112
## 425189         jeep                  <NA> 2003 2857993261
## 38376           gmc                  <NA> 2020 2808348671
## 1623      chevrolet                  <NA> 1955 1234567890
## 306218          ram 3500 crewcab tradesma 2018  123456789

As the listing shows, none of these is a Ferrari or a Lamborghini; the prices were simply entered incorrectly.
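
Two of the five prices above (1234567890 and 123456789) look like keyboard placeholders. A rough sketch of how such values could be flagged automatically; the helper `is_digit_run` and its `min_digits` threshold are hypothetical and not part of the original analysis:

```r
# TRUE when the digits of a price form a run inside "1234567890"
# (for example 123456789), which suggests a placeholder rather than a price.
is_digit_run <- function(price, min_digits = 6) {
  # sprintf() avoids scientific notation such as "1e+05"
  digits <- sprintf("%.0f", price)
  vapply(digits, function(d) {
    nchar(d) >= min_digits && grepl(d, "1234567890", fixed = TRUE)
  }, logical(1), USE.NAMES = FALSE)
}

# The five highest prices from the listing above
top_prices <- c(3615215112, 2857993261, 2808348671, 1234567890, 123456789)
flags <- is_digit_run(top_prices)
print(flags)
```

This only catches the obvious placeholders; the remaining three values would still need a threshold or a similar rule.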

# is.null() tests the whole column, not its elements, so this always selects 0 rows
nrow(data[is.null(data$price),])
## [1] 0
nrow(data[is.na(data$price),])
## [1] 33753

Less welcome is the fact, confirmed above, that a fairly large number of values of this attribute (33,753, roughly 7.4% of all rows) are zero, which we recoded as NA. We suspect these are artificially lowered or unspecified prices, entered to improve the listing's ranking in search results. That is why we replaced the zeros with NA at the very start, when loading the dataset. Sellers use this practice to get their listings onto the first pages, because buyers commonly sort from the lowest price up to their price ceiling.
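
The effect of listing "0" in `na.strings` can be seen on a tiny inline CSV. This is a minimal sketch with made-up rows. Note that `na.strings` applies to every column, so the second variant, which is an alternative and not the approach used in this report, blanks out zeros in `price` only:

```r
# Made-up CSV with a zero in each column
csv_text <- "price,odometer
0,40877
5995,0
12394,87641"

# Variant 1: as in the loading step, "0" is listed in na.strings,
# so a field that is exactly "0" becomes NA in every column
all_cols <- read.csv(text = csv_text, na.strings = c("", " ", "NA", "0"))

# Variant 2 (alternative): load zeros as-is, then recode them in price only
price_only <- read.csv(text = csv_text, na.strings = c("", " ", "NA"))
price_only$price[which(price_only$price == 0)] <- NA

print(all_cols)
print(price_only)
```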

In addition, the attribute contains extremely high values, which we will deal with in the data cleaning phase.
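
One common way to flag such values is the 1.5 × IQR rule, which is similar in spirit to the whiskers of a boxplot. The sketch below is only an illustration on made-up prices; the helper `flag_iqr_outliers` is hypothetical, and the cleaning phase may well use a different rule:

```r
# Flag values lying more than k * IQR below the first or above the third quartile.
# Missing values are never flagged.
flag_iqr_outliers <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  !is.na(x) & (x < q[1] - k * iqr | x > q[2] + k * iqr)
}

# Made-up prices: seven plausible values, two extreme ones and one NA
toy_prices <- c(1500, 5995, 8900, 12394, 15000, 22900, 31000,
                123456789, 3615215112, NA)

is_outlier <- flag_iqr_outliers(toy_prices)
print(toy_prices[is_outlier])
```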

boxplot(data$price, las=3)

ggplot(data = data, aes(sample=price)) +
  stat_qq() + 
  stat_qq_line() +
  scale_y_continuous(breaks = seq(0, 5000000, by = 50000)) 
## Warning: Removed 33753 rows containing non-finite values (stat_qq).
## Warning: Removed 33753 rows containing non-finite values (stat_qq_line).

Year (year)

Number of missing values:

sum(is.na(data$year))
## [1] 1050

As we can see, the year attribute has 1050 missing values.

One possible explanation is that rebuilt cars are common in the United States, so a seller may have left the year out because it carries little meaning for such a build, for example a 1970 body fitted with 2018 technology.
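
One way to probe this hypothesis is to check whether the year can be recovered from another field. The sketch below assumes that the listing `url` contains the model year as a separate four-digit token in its slug, as in `.../royal-palm-beach-2019-ram-1500-big-horn/...`. The helper `extract_year_from_url` and the `example.org` address are hypothetical:

```r
# Extract a four-digit year (19xx or 20xx) that is preceded by "/" or "-"
# and followed by "-" in a URL. Returns NA when no such token is found.
extract_year_from_url <- function(url) {
  year <- rep(NA_integer_, length(url))
  ok <- !is.na(url)
  m <- regexpr("[/-](19|20)[0-9]{2}-", url[ok])
  hit <- m > 0
  idx <- which(ok)[hit]
  # The year starts one character after the matched separator
  year[idx] <- as.integer(substr(url[idx], m[hit] + 1, m[hit] + 4))
  year
}

urls <- c(
  "https://auburn.craigslist.org/ctd/d/royal-palm-beach-2019-ram-1500-big-horn/7236904120.html",
  "https://bham.craigslist.org/ctd/d/new-castle-2018-jeep-compass-latitude/7237299924.html",
  "https://example.org/ctd/d/nice-truck/123.html",  # hypothetical, no year in slug
  NA
)

years <- extract_year_from_url(urls)
print(years)
```

If most rows with a missing `year` yield a recent year this way, the missing values are more likely a data collection issue than a sign of rebuilt cars.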

data[is.na(data$year),][1:5,]
##             id
## 16  7236904120
## 384 7238204872
## 470 7237486859
## 485 7237299924
## 850 7235124536
##                                                                                             url
## 16  https://auburn.craigslist.org/ctd/d/royal-palm-beach-2019-ram-1500-big-horn/7236904120.html
## 384     https://bham.craigslist.org/ctd/d/new-castle-2019-nissan-sentra-cvt-gun/7238204872.html
## 470        https://bham.craigslist.org/ctd/d/vicksburg-2019-chevrolet-silverado/7237486859.html
## 485     https://bham.craigslist.org/ctd/d/new-castle-2018-jeep-compass-latitude/7237299924.html
## 850     https://bham.craigslist.org/ctd/d/new-castle-2018-toyota-highlander-xle/7235124536.html
##         region                    region_url price year manufacturer
## 16      auburn https://auburn.craigslist.org 38500   NA         <NA>
## 384 birmingham   https://bham.craigslist.org 14500   NA         <NA>
## 470 birmingham   https://bham.craigslist.org 41800   NA         <NA>
## 485 birmingham   https://bham.craigslist.org 18700   NA         <NA>
## 850 birmingham   https://bham.craigslist.org 28900   NA         <NA>
##                     model condition   cylinders   fuel odometer title_status
## 16                    500      <NA> 8 cylinders    gas    28246        clean
## 384              n Sentra      <NA> 4 cylinders    gas    22546        clean
## 470 olet Silverado 2500HD      <NA> 8 cylinders diesel    80910         <NA>
## 485               Compass      <NA> 4 cylinders    gas    18316        clean
## 850          a Highlander      <NA> 6 cylinders    gas    63061        clean
##     transmission               VIN drive size   type paint_color
## 16     automatic 1C6RREMT7KN655834   rwd <NA> pickup       white
## 384    automatic 3N1AB7AP9KY380549   fwd <NA>  sedan        grey
## 470    automatic 1GC1KSEY9KF121232   4wd <NA> pickup       white
## 485    automatic 3C4NJCBB4JT108564   fwd <NA>    SUV         red
## 850    automatic 5TDKZRFH9JS546309   fwd <NA>    SUV       white
##                                                              image_url
## 16  https://images.craigslist.org/00Y0Y_65ISqDroMwD_0kE0bC_600x450.jpg
## 384 https://images.craigslist.org/00d0d_4rIc0Iq9E49_0kE0dM_600x450.jpg
## 470 https://images.craigslist.org/00b0b_jsyRZLmAUwW_0kE0fu_600x450.jpg
## 485 https://images.craigslist.org/00q0q_8PVU7cjYgfw_0kE0dN_600x450.jpg
## 850 https://images.craigslist.org/00S0S_iAz66PJkfz0_0kE0dL_600x450.jpg
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                description
## 16  2019 *Ram* *1500* Big Horn/Lone Star 4x2 Crew Cab 6'4" Box Truck - $38,500Call Us Today! 561-693-0621Text Us Today! 561-203-4849Ram_ 1500_ For Sale by Peterson Motorcars Call For The Best Deals Today!  Vehicle Description For This *Ram* *1500*PETERSON MOTORCARS USED TRUCKS FOR SALE WEST PALM BEACH FL 33409Clean carfax, one owner, Florida truck, Big horn Sport, 6'4" box, 8.4" touch screen with apple carplay, bluetooth, usb, mp3, backup camera, power sliding rear window, power seats, keyless go, 5.7L v8, Sport appearance package, power adjustable pedals, level 1 equiptment group, power folding mirrors, 3.92 axle ratio, anti spin rear differential and so much more! Call 561 371 5504 or visit www.PetersonMotorcars.com for more information and photos!  PETERSON MOTORCARS USED TRUCKS FOR SALE WEST PALM BEACH FL 33409  View additional pictures and details This Ram_ 1500_ at: http://www.petersonmotorcars.com/details-2019-ram-1500-big_horn_lone_star_4x2_crew_cab_6_4_box-used-1c6rremt7kn655834.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist  Vehicle Details For This *Ram* *1500*       Year: 2019     Make: Ram     Model: 1500     Trim: Big Horn/Lone Star 4x2 Crew Cab 6'4" Box     VIN: 1C6RREMT7KN655834     Stock#: X655834     Condition: Used Clear Title               Miles: 28,246          Exterior Color: Bright White Clearcoat     Interior Color: Black     Engine: 5.7L 8 CYLINDER      Transmission: 8 Spd Automatic     Drivetrain: Rear Wheel Drive     Ram        Installed Options & Packages For This *Ram* *1500*                      ENGINE: 5.7L V8 HEMI MDS VVT EZH                                       -  Hemi Badge                      Dual Rear Exhaust w/Bright Tips                      180 Amp Alternator                      Heavy Duty Engine Cooling                      Active Noise Control System                              TRANSMISSION: 8-SPEED AUTOMATIC (850RE) DFT                                                TRANSMISSION: 
8-SPEED AUTOMATIC (8HP75) DFR                                                QUICK ORDER PACKAGE 24Z BIG HORN/LONE STAR 24Z                                       -  Engine: 5.7L V8 HEMI MDS VVT                      Transmission: 8-Speed Automatic (8HP75)                      Steering Wheel Mounted Audio Controls                              3.92 REAR AXLE RATIO DMH                                                BRIGHT WHITE CLEARCOAT PW7                                                SPORT APPEARANCE PACKAGE AEF                                       -  Body Color Door Handles                      Tires: 275/55R20 OWL All Season                      Grille B/Color Outline 1 Texture 2                      Body Color Rear Bumper w/Step Pads                      Exterior Mirrors Courtesy Lamps                      Black Interior Accents                      Auto Dim Exterior Driver Mirror                      Body Color Front Bumper                      Exterior Mirrors w/Supplemental Signals                      Exterior Mirrors w/Memory                      Power-Folding Mirrors                      Power Heated Fold-Away Mirrors                              BIG HORN LEVEL 1 EQUIPMENT GROUP A62                                       -  Rear Window Defroster                      Cluster 3.5" TFT Color Display                      Power 8-Way Driver Seat                      Rear Power Sliding Window                      Sun Visors w/Illuminated Vanity Mirrors                      Glove Box Lamp                      Integrated Center Stack Radio                      Class IV Receiver Hitch                      Single Disc Remote CD Player                      Power 4-Way Driver Lumbar Adjust                      Power Adjustable Pedals                      Foam Bottle Insert (Door Trim Panel)                      Google Android Auto                      For More Info                      Call 800-643-2112                      Exterior Mirrors Courtesy Lamps              
        1-Year SiriusXM Radio Service                      Auto Dim Exterior Driver Mirror                      Radio: Uconnect 4 w/8.4" Display                      SiriusXM Satellite Radio                      Exterior Mirrors w/Supplemental Signals                      Big Horn IP Badge                      Rear Dome w/On/Off Switch Lamp                      Universal Garage Door Opener                      Power Heated Fold Away Mirrors                      Rear View Auto Dim Mirror                      8.4" Touchscreen Display                      Power-Folding Mirrors                      Apple CarPlay                              ANTI-SPIN DIFFERENTIAL REAR AXLE DSA                                               Ram   About Us      Peterson Motorcars CORPORATE OFFICES 1844 Church St  West Palm Beach, FL 33409  Call NOW to Reserve this Ram_ 1500_! 561-693-0621Text NOW to Reserve this Ram_ 1500_! 561-203-4849   *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *PETERSON MOTORCARS USED TRUCKS FOR SALE WEST PALM BEACH FL 33409* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *For Sale* *Clean* *Bright White Clearcoat* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *Cheap* *Like New* *Rear Wheel Drive* *5.7L 8 CYLINDER * *Used* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box*
## 384                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                                                   2019 *Nissan* *Sentra* S CVT Sedan - $14,500Call or Text Us Today! 205-512-6569\t Nissan_ Sentra_ For Sale by World Class MotorsFor Financing - step 1 is to complete our short online application @ WorldClassApproval.com     Vehicle Description For This *Nissan* *Sentra*We finance | 1-Owner, clean Carfax - like brand NEW 2019 Nissan Sentra S CVT sedan! Only 22K miles & still covered under factory new car warranty. Excellent daily driver that is reliable & gets 37-mpg! We finance. Low rates. Call or text 256-595-9403 for more info. Apply now @ WorldClassApproval.com. Trades welcome. Shipping available. 
1920 Decatur Hwy, Gardendale, AL.View additional pictures and details This Nissan_ Sentra_ at: http://www.worldclassmotors.com/details-2019-nissan-sentra-s_cvt-used-3n1ab7ap9ky380549.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist  Vehicle Details For This *Nissan* *Sentra*       Year: 2019     Make: Nissan     Model: Sentra     Trim: S CVT     VIN: 3N1AB7AP9KY380549     Stock#: 283833     Condition: Used Clear Title               Miles: 22,546          Exterior Color: Gun Metallic     Interior Color: Charcoal     Engine: 1.8L 4 CYLINDER     Transmission: CVT     Drivetrain: Front Wheel Drive     Nissan        Features & Options For This *Nissan* *Sentra*                  Ext / Int Color                               Gun Metallic with Charcoal Cloth Interior                      Luxury Features                               Cruise Control                 Remote Trunk Lid                 Steering Wheel Radio Controls                 Telescoping Steering Wheel                 Tire Pressure Monitor                      Power Equipment                               Power Mirrors                 Power Steering                      Safety Features                               Child Proof Door Locks                 Driver's Air Bag                 Intermittent Wipers                 Passenger Air Bag                 Rear Defogger                 Roll Stability Control                 Side Air Bags                 Side Curtain Airbags                      Interior                               Center Arm Rest                 Clock                 Overhead Console                 Tachometer                 Vanity Mirrors                      Exterior                               Remote Fuel Door                 Sliding Rear Window                      Audio / Video                               AM/FM                 Bluetooth                 CD Player                 Reverse Camera                 Touch Screen                Nissan   About 
Us      World Class Motors 1920 Decatur Highway  Gardendale, AL 35071  Call or Text NOW to Reserve this Nissan_ Sentra_! 205-512-6569\t    *Nissan* *Sentra* *S CVT* *Nissan* *Sentra* *S CVT* *For Sale* *Clean* *Gun Metallic* *Nissan* *Sentra* *S CVT* *Cheap* *Like New* *Front Wheel Drive* *1.8L 4 CYLINDER* *Used* *Nissan* *Sentra* *S CVT* *Nissan* *Sentra* *S CVT* *Nissan* *Sentra* *S CVT*
## 470                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   2019 *Chevrolet* *Silverado 2500HD* CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD Truck - $41,800Call Us Today! 601-374-6258Chevrolet_ Silverado 2500HD_ For Sale by George Carr Buick GMC  Vehicle Description For This *Chevrolet* *Silverado 2500HD*4X4 2500HD DURAMAX DIESELVERY VERY CLEAN, ONE OWNER. OFF LEASE. COMPLETELY SERVICED AND INSPECTED BY OUR GM CERTIFIED TECHNICIANS. B&W GOOSENECK HITCH. BUILT IN BRAKE CONTROLLER. NEW COOPER DISCOVER TIRES. READY TO GET DOWN TO WORK. FINANCING AVAILABLE (W.A.C.). 
View additional pictures and details This Chevrolet_ Silverado 2500HD_ at: http://www.georgecarrworktrucks.com/details-2019-chevrolet-silverado_2500hd-crew_cab_4x4_duramax_diesel_2500_2500hd-used-1gc1ksey9kf121232.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist  Vehicle Details For This *Chevrolet* *Silverado 2500HD*       Year: 2019     Make: Chevrolet     Model: Silverado 2500HD     Trim: CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD     VIN: 1GC1KSEY9KF121232     Stock#: P15011     Condition: Used Clear Title               Miles: 80,910          Exterior Color: Summit White     Interior Color: Jet Black/Medium Ash Gray Piping and Stitching     Engine: 6.6L 8 CYLINDER TURBOCHARGED     Transmission: 6 Spd Automatic     Drivetrain: Four Wheel Drive     Chevrolet        Installed Options & Packages For This *Chevrolet* *Silverado 2500HD*                      LT PREFERRED EQUIPMENT GROUP 1LT                                       -  Standard Equipment                             Chevrolet   About Us      George Carr Buick GMC Contact: ROBERT LANDRY 2950 S. Frontage Rd. Vicksburg, MS 39180  Call NOW to Reserve this Chevrolet_ Silverado 2500HD_! 601-374-6258   *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *4X4 2500HD DURAMAX DIESEL* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *For Sale* *Clean* *Summit White* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *Cheap* *Like New* *Four Wheel Drive* *6.6L 8 CYLINDER TURBOCHARGED* *Used* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD*
## 485                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              2018 *Jeep* *Compass* Latitude FWD SUV - $18,700Call or Text Us Today! 
205-512-6569\t Jeep_ Compass_ For Sale by World Class MotorsFor Financing - step 1 is to complete our short online application @ WorldClassApproval.com     Vehicle Description For This *Jeep* *Compass*Like brand NEW 1-Owner, clean Carfax 2018 Jeep Compass Latitude! Local new car trade-in. Loaded with navigation, backup camera, power driver seat, keyless alarm, premium audio, Sirius XM satellite radio, Bluetooth, music interphase & more. Serviced, inspected & comes with warranty. We finance. Low rates! Call or text 256-595-9403 for more info. Apply now @ WorldClassApproval.com. Trades welcome. Shipping available. 1920 Decatur Hwy, Gardendale, AL.View additional pictures and details This Jeep_ Compass_ at: http://www.worldclassmotors.com/details-2018-jeep-compass-latitude_fwd-used-3c4njcbb4jt108564.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist  Vehicle Details For This *Jeep* *Compass*       Year: 2018     Make: Jeep     Model: Compass     Trim: Latitude FWD     VIN: 3C4NJCBB4JT108564     Stock#: 283886     Condition: Used Clear Title               Miles: 18,316          Exterior Color: Redline Pearlcoat     Interior Color: Black     Engine: 2.4L 4 CYLINDER     Transmission: 6 Spd Automatic     Drivetrain: Front Wheel Drive     Jeep        Installed Options & Packages For This *Jeep* *Compass*                      ENGINE: 2.4L I4 PZEV M-AIR W/ESS EDE                                                TRANSMISSION: 6-SPEED AISIN F21-250 GEN 3 AUTO DF7                                                QUICK ORDER PACKAGE 28J 28J                                       -  Engine: 2.4L I4 PZEV M-Air w/ESS                      Transmission: 6-Speed Aisin F21-250 Gen 3 Auto                              REDLINE PEARLCOAT PRM                                                NAVIGATION GROUP AMA                                       -  USB Host Flip                      Google Android Auto                      Premium Air Filter                      
For More Info                      Call 800-643-2112                      Radio: Uconnect 4C Nav w/8.4" Display                      Integrated Center Stack Radio                      1-Year SiriusXM Radio Service                      SiriusXM Satellite Radio                      GPS Antenna Input                      Air Conditioning ATC w/Dual Zone Control                      Apple CarPlay                      Humidity Sensor                              POPULAR EQUIPMENT GROUP ANF                                       -  Cluster 7.0" Color Driver Info Display                      115V Auxiliary Power Outlet                      7.0" Touch Screen Display                      Remote Start System                      Rear View Auto Dim Mirror                      Power 8-Way Driver/Manual 6-Way Passenger Seats                      4-Way Power Lumbar Adjust                              POWER 8-WAY DRIVER/MANUAL 6-WAY PASSENGER SEATS JPR                                       -  4-Way Power Lumbar Adjust                             Jeep   About Us      World Class Motors 1920 Decatur Highway  Gardendale, AL 35071  Call or Text NOW to Reserve this Jeep_ Compass_! 205-512-6569\t    *Jeep* *Compass* *Latitude FWD* *Jeep* *Compass* *Latitude FWD* *For Sale* *Clean* *Redline Pearlcoat* *Jeep* *Compass* *Latitude FWD* *Cheap* *Like New* *Front Wheel Drive* *2.4L 4 CYLINDER* *Used* *Jeep* *Compass* *Latitude FWD* *Jeep* *Compass* *Latitude FWD* *Jeep* *Compass* *Latitude FWD*
## 850                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                2018 *Toyota* *Highlander* XLE V6 FWD SUV - $28,900Call or Text Us Today! 205-512-6569\t Toyota_ Highlander_ For Sale by World Class MotorsFor Financing - step 1 is to complete our short online application @ WorldClassApproval.com     Vehicle Description For This *Toyota* *Highlander*1-Owner, clean Carfax - like brand NEW 2018 Toyota Highlander V6 XLE finished in pearl white over black premium leather interior! Features include a 3.5L V6 engine, full power heated leather seats, 3rd row, navigation, backup camera, blind spot assist, rear climate control, power sunroof, power trunk, premium audio, Sirius XM satellite radio, Bluetooth, music interphase, premium alloys, rear bucket seats & more! Fully serviced, inspected & comes with warranty. We finance. Low competitive rates! Call or text 256-595-9403 for more info. 
Apply now @ WorldClassApproval.com. Trades welcome. Shipping available. 1920 Decatur Hwy, Gardendale, AL.View additional pictures and details This Toyota_ Highlander_ at: http://www.worldclassmotors.com/details-2018-toyota-highlander-xle_v6_fwd-used-5tdkzrfh9js546309.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist  Vehicle Details For This *Toyota* *Highlander*       Year: 2018     Make: Toyota     Model: Highlander     Trim: XLE V6 FWD     VIN: 5TDKZRFH9JS546309     Stock#: 283875     Condition: Used Clear Title               Miles: 63,061          Exterior Color: Blizzard Pearl     Interior Color: Black     Engine: 3.5L V6 CYLINDER     Transmission: 8 Spd Automatic     Drivetrain: Front Wheel Drive     Toyota        Installed Options & Packages For This *Toyota* *Highlander*                      BLIZZARD PEARL 070                                                SECURITY SYSTEM V5                                               Toyota   About Us      World Class Motors 1920 Decatur Highway  Gardendale, AL 35071  Call or Text NOW to Reserve this Toyota_ Highlander_! 205-512-6569\t    *Toyota* *Highlander* *XLE V6 FWD* *Toyota* *Highlander* *XLE V6 FWD* *For Sale* *Clean* *Blizzard Pearl* *Toyota* *Highlander* *XLE V6 FWD* *Cheap* *Like New* *Front Wheel Drive* *3.5L V6 CYLINDER* *Used* *Toyota* *Highlander* *XLE V6 FWD* *Toyota* *Highlander* *XLE V6 FWD* *Toyota* *Highlander* *XLE V6 FWD*
##     state      lat      long             posting_date
## 16     al 26.70385 -80.08200 2020-11-25T11:52:03-0600
## 384    al 33.66960 -86.81762 2020-11-28T10:01:51-0600
## 470    al 32.33205 -90.85716 2020-11-26T19:10:40-0600
## 485    al 33.66960 -86.81762 2020-11-26T10:12:22-0600
## 850    al 33.66960 -86.81762 2020-11-22T04:40:36-0600

As we can see, the description attribute often states the year of the vehicle, so we could consider extracting the year from this attribute to fill in the missing values.
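A minimal sketch of this idea, using only base R. The toy data frame `cars` and the helper `extract_year` are hypothetical names introduced for illustration, and the accepted range of years (1900 to 2029) is an assumption. The pattern simply takes the first standalone four-digit number in that range, so it can be fooled by phone numbers or stock numbers that happen to look like a year.

```r
# Toy stand-in for the real `data` frame; the rows are invented for illustration.
cars <- data.frame(
  year = c(NA, 2015, NA, NA),
  description = c(
    "2018 *Jeep* *Compass* Latitude FWD SUV - $18,700",
    "Great truck, everything works",
    "Call 205-512-6569 today about this truck",
    "Clean 2007 Mazda, runs great"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the first standalone four-digit number in
# 1900-2029 found in each string, or NA when there is none.
extract_year <- function(text) {
  pattern <- "\\b(19[0-9]{2}|20[0-2][0-9])\\b"
  hit <- regexpr(pattern, text, perl = TRUE)
  found <- !is.na(hit) & hit > 0
  out <- rep(NA_integer_, length(text))
  out[found] <- as.integer(regmatches(text, hit))
  out
}

# Fill only the rows where `year` is missing; existing values stay untouched.
missing_year <- is.na(cars$year)
cars$year[missing_year] <- extract_year(cars$description[missing_year])
cars$year
## [1] 2018 2015   NA 2007
```

Any year recovered this way should be checked against the rest of the listing before it is used, because a description may mention several years (for example a dealer listing multiple vehicles).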

ggplot(data = data[!is.na(data$year), ], aes(x = year)) +
  geom_histogram(bins = 121, fill = 6, color = "#ffffff") +
  xlab("Year") +
  ylab("Frequency") +
  scale_y_continuous(breaks = seq(0, 500000, by = 20000)) +
  scale_x_continuous(breaks = seq(1900, 2021, by = 5)) +
  theme(axis.text.x = element_text(angle = 90))

We can see that the year values do not come from a normal distribution. We have few older vehicles (before 1990) and also few of the very newest ones; as expected, vehicles from the last three decades are the most numerous. Older vehicles do appear, but those are mostly advertised on specialised forums, for example for vintage cars.

This distribution was to be expected, since the average vehicle age is 12 years.
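The claim about normality is based on the histogram alone. One way to back it up is a Shapiro-Wilk test, sketched below on simulated data. The vector `year` here is a made-up stand-in for `data$year`, and the exponential age distribution is an assumption chosen only to produce a skewed sample.

```r
set.seed(1)  # make the simulated sample reproducible

# Simulated stand-in for data$year: ages drawn from an exponential
# distribution (an assumption, not the real data), plus some NAs.
year <- c(2021 - round(rexp(6000, rate = 1 / 12)), rep(NA, 50))

# shapiro.test() accepts between 3 and 5000 non-missing values,
# so we test a random subsample of the observed years.
observed <- year[!is.na(year)]
sampled <- sample(observed, min(5000, length(observed)))
result <- shapiro.test(sampled)

# A small p-value is evidence against normality.
result$p.value < 0.05
```

With a dataset this large, almost any deviation from normality becomes statistically significant, so the histogram (or a Q-Q plot via `qqnorm()`) remains the more informative check.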

Manufacturer (manufacturer)

Number of missing values:

nrow(data[is.na(data$manufacturer),])
## [1] 18220

In total, the attribute has about 18 thousand missing manufacturer values; either the manufacturer was never entered, or the data were lost during export.
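A raw count is easier to judge as a share of all rows: 18220 missing values out of 458213 records is roughly 4%. The sketch below shows one way to get both numbers for every column at once. The toy data frame `cars` is invented for illustration; on the real data the same calls would be applied to `data`.

```r
# Toy stand-in for the real `data` frame; values are invented for illustration.
cars <- data.frame(
  manufacturer = c("ford", NA, "toyota", NA, "honda"),
  model = c("f-150", "compass", NA, "civic", "accord"),
  stringsAsFactors = FALSE
)

# is.na() yields a logical matrix; colSums() counts the TRUE values per column
# and colMeans() gives the same counts as a share of all rows.
missing_count <- colSums(is.na(cars))
missing_share <- colMeans(is.na(cars))

missing_count
round(100 * missing_share, 1)  # percentage of missing values per column
```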

Let's count how often each manufacturer occurs, list the ten most frequent values, and then plot the frequencies:

group_manu <- data %>%
  group_by(manufacturer) %>%
  summarize(frequency = n())

group_manu[order(-group_manu$frequency),][1:10,]
## # A tibble: 10 x 2
##    manufacturer frequency
##    <chr>            <int>
##  1 ford             79666
##  2 chevrolet        64977
##  3 toyota           38577
##  4 honda            25868
##  5 nissan           23654
##  6 jeep             21165
##  7 <NA>             18220
##  8 ram              17697
##  9 gmc              17267
## 10 dodge            16730

We can see that America did not disappoint: the ranking is led by a domestic manufacturer, Ford.

ggplot(data = group_manu, aes(x = manufacturer, y = frequency)) +
  geom_bar(stat = "identity", fill = 2, color = "#ffffff") +
  ylab("Frequency") +
  xlab("Manufacturer") +
  theme(axis.text.x = element_text(angle = 90)) +
  scale_y_continuous(breaks = seq(0, 100000, by = 5000))

Model (model)

Number of missing values:

nrow(data[is.na(data$model),])
## [1] 4847

We have approximately 5 thousand records with no value in the model column. The following table shows, however, that some of these records contain the model name in the listing description (description).

data[is.na(data$model),][1:5,]
##             id
## 43  7232651921
## 206 7239785162
## 212 7239719006
## 368 7238346972
## 388 7238172849
##                                                                                           url
## 43  https://auburn.craigslist.org/ctd/d/ton-service-utility-trucks-ford-chevy/7232651921.html
## 206       https://bham.craigslist.org/cto/d/gainesville-2017-ram-4x4-for-sale/7239785162.html
## 212                 https://bham.craigslist.org/cto/d/helena-2011-range-rover/7239719006.html
## 368                     https://bham.craigslist.org/cto/d/kellyton-2007-mazda/7238346972.html
## 388             https://bham.craigslist.org/cto/d/birmingham-2005-range-rover/7238172849.html
##         region                    region_url price year manufacturer model
## 43      auburn https://auburn.craigslist.org    NA 2014          ram  <NA>
## 206 birmingham   https://bham.craigslist.org 18000 2017          ram  <NA>
## 212 birmingham   https://bham.craigslist.org 22000 2011        rover  <NA>
## 368 birmingham   https://bham.craigslist.org  3400 2006        mazda  <NA>
## 388 birmingham   https://bham.craigslist.org  6900 2005        rover  <NA>
##     condition   cylinders   fuel odometer title_status transmission  VIN drive
## 43       <NA>        <NA> diesel        0        clean    automatic <NA>  <NA>
## 206      <NA>        <NA>    gas    95000        clean    automatic <NA>  <NA>
## 212      <NA> 8 cylinders    gas    79000        clean    automatic <NA>   4wd
## 368      <NA>        <NA>    gas       NA      rebuilt    automatic <NA>  <NA>
## 388      <NA>        <NA>    gas       NA        clean    automatic <NA>  <NA>
##     size  type paint_color
## 43  <NA> other       white
## 206 <NA>  <NA>        <NA>
## 212 <NA>  <NA>       black
## 368 <NA>  <NA>        <NA>
## 388 <NA>  <NA>        <NA>
##                                                              image_url
## 43  https://images.craigslist.org/00303_eLTsWH0uS84_0gw0co_600x450.jpg
## 206 https://images.craigslist.org/00n0n_cZ7RUc9IPIl_0t20CI_600x450.jpg
## 212 https://images.craigslist.org/00j0j_jesJuu1ztPT_0CI0t2_600x450.jpg
## 368 https://images.craigslist.org/00I0I_l2zYOK4B5Vz_0CI0t2_600x450.jpg
## 388 https://images.craigslist.org/00606_fJy6wrbrq4Z_0CI0t2_600x450.jpg
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                             description
## 43  All Trucks USA12106 Old River RdRockton, IL 61072Ask for: Craigslist SalesMain: (815) 624-1400Light Duty Service Trucks + Commercial Truck Super Store www.AllTrucksUSA.comPrice: Call for PricingDescription:     ****SEE A TRUCK FOR SALE AND WANT MORE PHOTOS? JUST TYPE THE SIX DIGIT   STOCK NUMBER IN THE SEARCH BAR AT www.AllTrucksUSA.com / OR CALL   815-624-1400 FOR PRICING OR QUICK ANSWERS TO ANY QUESTIONS. FINANCING   & DELIVERY AVAILABLE****----------------------------------------------------------------------------------------------------------------------------------------------Stock# B59285 - 2013 FORD F350 2WD REGULAR CAB SERVICE TRUCK, 6.2L V8 GAS, AUTOMATIC, 11' KNAPHEIDE UTILITY BODY w/ FLIP UP STORAGE LIDS, LADDER RACK, CLOTH BUCKET SEATS, TILT STEERING, CRUISE CONTROL, A/C, AM/FM RADIO, TRACTION CONTROL, 14,000 lb GVW / 122,682 MILESStock# B76324 - 2013 FORD F350 4X4 REGULAR CAB SERVICE TRUCK, 6.7L V8 POWER STROKE TURBO   DIESEL, AUTOMATIC, 9' KNAPHEIDE UTILITY BODY, HITCH RECEIVER, FACTORY   BRAKE CONTROLLER, POWER WINDOWS LOCKS AND MIRRORS, VINYL BUCKET SEATS,   CRUISE CONTROL, TILT STEERING, A/C, AM/FM RADIO, 4WD, 14,000 lb GVW /   67,923 MILESStock# 142845 - 2015 CHEVY 3500HD 4X4 CREW CAB SERVICE TRUCK, 6.0L V8 GAS, AUTOMATIC, 9'   KNAPHEIDE UTILITY BODY, DUALLY, HITCH RECEIVER, FACTORY BRAKE   CONTROLLER, VINYL BUCKET SEATS, POWER WINDOWS MIRRORS AND LOCKS, CRUISE   CONTROL, TILT STEERING, AM/RM RADIO, TRACTION CONTROL, 4WD, 13,200 lb   GVW / 152,637 MILESStock# C97015 - 2011 FORD F250 2WD EXTENDED CAB SERVICE TRUCK, 6.2L V8 GAS & CNG FUEL, AUTOMATIC, 8' STEELWELD UTILITY BODY, HITCH RECEIVER, FACTORY BRAKE CONTROLLER, CLOTH BUCKET SEATS, POWER WINDOWS MIRRORS AND LOCKS, CRUISE CONTROL, DIFFERENTIAL LOCK, A/C, TRACTION CONTROL, LADDER RACK, 10,000 lb GVW / 176,743 MILES  -- $19,950  \t\t\t\t\t  \t\t\t\t\t\t  \t\t\t\t  \t\t\t\t\t\t\t\t  \t\t\t\t\t\t\t\t  \t\t\t\tStock#   C71616 - 2012 FORD F350 2WD EXTENDED CAB SERVICE TRUCK, 6.2L V8 GAS,   
AUTOMATIC, 9' ETI UTILITY BODY w/ FLIP UP STORAGE LID, PINTLE HITCH,   CLOTH BUCKET SEATS, POWER WINDOWS LOCKS AND MIRRORS, CRUISE CONTROL, CD   PLAYER RADIO, TRACTION CONTROL, 13,300 lb GVW / 168,419 MILES Stock# 138752 - 2012 DODGE RAM 3500HD 4X4 REGULAR CAB SERVICE TRUCK, 5.7L HEMI V8 GAS, AUTOMATIC, 9' RAWSON KOENIG UTILITY BODY, HITCH RECEIVER, VINYL BUCKET SEATS, POWER WINDOWS MIRRORS AND LOCKS, CD PLAYER RADIO, TOW HAUL, A/C, 4WD, DUALLY (DRW), 12,500 lb GVW / 157,065 MILESStock# 134948 - 2013 GMC SIERRA 3500HD 4X4 CREW CAB SERVICE TRUCK, 6.6L V8 DURAMAX TURBO DIESEL, AUTOMATIC, 9' ROYAL UTILITY BODY w/ FLIP TOP STORAGE, HITCH RECEIVER, FACTORY BRAKE CONTROLLER, POWER WINDOWS LOCKS AND MIRRORS, CLOTH BUCKET SEATS, CRUISE CONTROL, AM/FM RADIO, PTO CAPABLE, TILT STEERING, 4WD, 13,200 lb GVW / 114,998 MILES /Stock# 225534 - 2011 GMC 2500HD 4X4 CREW CAB MECHANICS TRUCK, 6.6L V8 DURAMAX TURBO   DIESEL, AUTOMATIC, 3,200 lb RKI CRANE, 8' KNAPHEIDE UTILITY BODY, 2   OUTRIGGERS, CLOTH BUCKET SEATS, CRUISE CONTROL, FACTORY BRAKE   CONTROLLER, AM/FM RADIO, TRACTION CONTROL, EXHAUST BRAKE, 160,464 MILESStock# A75556 - 2016 FORD TRANSIT 3500 HD REGULAR CAB CUTAWAY SERVICE VAN, 3.7L V6 GAS, AUTOMATIC, 11' READING ENCLOSED UTILITY BODY, 10' INSIDE FLOOR LENGTH, 48.5'' FLOOR WIDTH, 4' 11'' STANDING HEIGHT, VINYL BUCKET SEATS, POWER WINDOWS MIRRORS AND LOCKS, A/C, CD PLAYER RADIO, BLUETOOTH, BACK-UP CAMERA, 2WD, DUALLY (DRW), 9,950 lb GVW / 131,719 MILES Stock# 305423 - 2013 GMC 3500 HD 4X4 SERVICE TRUCK, 6.0L VORTEC V8 GAS, AUTOMATIC, 8' STAHL UTILITY BODY, HITCH RECEIVER, FACTORY BRAKE CONTROLLER, CLOTH BUCKET SEATS, TILT STEERING, POWER WINDOWS MIRRORS AND LOCKS, CRUISE CONTROL, CD PLAYER RADIO, A/C, DUALLY (DRW), 13,025 lb GVW / 182,382 MILESStock# 326792 - 2012 CHEVY 3500HD 2WD EXTENDED CAB ENCLOSED SERVICE TRUCK, 6.0L VORTEC V8 GAS, 8' ENCLOSED BRAND FX UTILITY BODY, 4' FLOOR WIDTH, 6' STANDING HEIGHT, POWER INVERTER, PINTLE HITCH, FACTORY BRAKE CONTROLLER, VINYL 
BUCKET SEATS, CRUISE CONTROL, TILT STEERING, CD PLAYER, A/C, TRACTION CONTROL, SINGLE REAR WHEEL (SRW), 10,000 lb GVW / 160,413 MILESStock# 139072 - 2012 CHEVY 2500HD 4X4 EXTENDED CAB SERVICE TRUCK, 6.0L VORTEC V8 GAS,   AUTOMATIC, 9' KNAPHEIDE UTILITY BODY, HITCH RECEIVER, LADDER RACK, CLOTH   BUCKET SEATS, TILT STEERING, A/C, CRUISE CONTROL, RADIO, TRACTION   CONTROL, 4WD, SINGLE REAR WHEEL (SRW), 9,500 lb GVW / 223,064 MILESStock# B66477 - 2011 FORD F350 2WD REGULAR CAB SERVICE TRUCK, 6.7L POWERSTROKE TURBO   DIESEL, AUTOMATIC, 9' ETI UTILITY BODY, HITCH RECEIVER, FACTORY BRAKE   CONTROLLER, VINYL BUCKET SEATS, TILT STEERING, CRUISE CONTROL, A/C, CD   PLAYER RADIO, TRACTION CONTROL, DUALLY (DRW), 13,300 lb GVW / 206,192   MILES                     Stock# 237861 - 2012 GMC 2500 HD 4X4 EXTENDED CAB SERVICE TRUCK, 6.0L VORTEC V8 GAS,   AUTOMATIC, 8' KNAPHEIDE UTILITY BODY, HITCH RECEIVER, CLOTH BUCKET   SEATS, A/C, CRUISE CONTROL, AM/FM RADIO, TOW HAUL, 4WD, SINGLE REAR   WHEEL (SRW), 9,500 lb GVW / 167,757 MILESAll Trucks USAAsk for: Craigslist Sales☎ (815) 624-140012106 Old River Rd Rockton, IL 61072Basic Information:Year: 2014Make: FordModel: FORD CHEVY GMC DODGE RAMStock Number: 333333Condition: UsedType: Service, UtilityClass: Class 3 (10,001-14,000 Lbs.)Color: WHITEMileage: 0Cab Type: REGULAR / STANDARDPassengers: 3Body Type: Regular CabTrim: POWERSTROKE DURAMAX CUMMINS DIESEL AND GASCondition:Title: ClearEngine:Fuel Type: DieselEngine Make: FordEngine Description: 6.2L V8DriveTrain:Transmission Type: AutomaticSuspension:Suspension Type: SpringInstrumentation:TachometerTrip OdometerIn Car Entertainment:CD PlayerAM/FM StereoSeats:Seat Upholstery: VinylSeat Type: BucketConvenience:Power Door LocksPower WindowsPower SteeringPower MirrorsTilt Steering WheelTilt/Telescoping Steering WheelCruise ControlComfort:Air Conditioning**Great selection of commercial trucks. All trucks go through our mega service center and are ready to start working and making you money. 
For quick answers to any questions you may have please call 815-624-1400 . We are here to help, delivery available. You can also visit our online showroom at www.AllTrucksUSA.com A27FBAFAEA464DBBB650D168074EE06C 16922656 8931296
## 206                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                    Great truck everything works ding on back finder 4x4 works great hwy miles $18,000 cash 💸 cell #  show contact info
## 212 ***(Runs like new )... -New motor rebuilt (less than 400 miles since rebuilt ).! Didn’t need complete rebuild but timing chain had to be replaced so went ahead and rebuild motor.  Around $7500  -New suspension pump and sensors $2500 -Brand new tires (All terrain Nitto Terra Grappler ). $1500 New battery $250  Loaded (navigation , back up camera ,xm radio , sunroof , Bluetooth, entertainment etc  NO LOWBALLERS!!! Always an Alabama truck so zero rust ! Call or text 205281203six  Basically a new Rover except not $85k sticker price Serious buyers not tire kickers, serious buyers can drive suv to mechanich of their preference and verify mechanical condition.   Won’t be disappointed.   No scammers or help needed to sell it. Clean and clear Alabama title ready to go. Great condition only selling to buy a new one
## 368 2007 Mazda 6 automatic 4 cylinder 120,000 miles on it gas saver Alabama rebuilt title runs great Good tires Ac & heat work Asking $3400 OBO
## 388 2005 Range Rover HSE, 4X4 , V8 , only 102k miles , very good running suv ! ,  loaded , leather , sunroof, navigation, back up sensors, towing package , cold AC , heater works , clean title  $6500  Cash no traded , no finance    show contact info
##     state     lat     long             posting_date
## 43     al      NA       NA 2020-11-17T14:55:59-0600
## 206    al 32.8210 -88.1589 2020-12-01T07:50:35-0600
## 212    al 33.2663 -86.9020 2020-12-01T00:27:55-0600
## 368    al 32.9791 -86.0484 2020-11-28T13:17:42-0600
## 388    al 33.4653 -86.8082 2020-11-28T09:07:31-0600
group_model <- data %>%
  group_by(model) %>%
  summarize(frequency = n())

group_model[order(-group_model$frequency),][1:10,]
## # A tibble: 10 x 2
##    model          frequency
##    <chr>              <int>
##  1 f-150               8370
##  2 silverado 1500      5964
##  3 <NA>                4847
##  4 1500                4211
##  5 camry               4033
##  6 accord              3730
##  7 altima              3490
##  8 civic               3479
##  9 escape              3444
## 10 silverado           3090
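
In the ranking above, `<NA>` comes out as the third most frequent "model", so missing names compete with real ones. A minimal sketch of ranking with and without missing values follows; `toy_models` is a hypothetical stand-in for `data$model`, not the real column.

```r
# Hypothetical toy vector standing in for data$model.
toy_models <- c("f-150", "camry", NA, "f-150", "civic", "camry", "f-150", NA)

# useNA = "ifany" keeps NA as its own category, as in the tibble above.
counts_with_na <- sort(table(toy_models, useNA = "ifany"), decreasing = TRUE)

# table() drops NA by default, so only known model names are ranked here.
top_known <- head(sort(table(toy_models), decreasing = TRUE), 2)

print(counts_with_na)
print(top_known)
```

Whether missing model names should be dropped or kept as a separate category depends on what the ranking is used for; the sketch only shows both options.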

Condition (condition)

nrow(data[is.na(data$condition),])
## [1] 192940

About 42% of the Craigslist listings (192,940 of 458,213) leave the vehicle condition column empty ヽ(°〇°)ノ. That is a real problem, because we wanted to use this attribute, among others, to determine vehicle price trends.
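
To compare missingness across attributes at a glance, the share of `NA` values can be computed for every column at once. This is a minimal sketch on a hypothetical toy table (`toy` and its values are made up for illustration); on the real table the same expression would be applied to `data`.

```r
# Hypothetical toy table standing in for the listings data.
toy <- data.frame(
  condition = c("good", NA, "excellent", NA, "fair"),
  fuel      = c("gas", "gas", NA, "diesel", "gas"),
  stringsAsFactors = FALSE
)

# is.na() yields a logical matrix; colMeans() gives the NA share per column.
missing_share <- round(100 * colMeans(is.na(toy)), 1)
print(missing_share)
```

For the condition column this corresponds to 192940 / 458213, which is roughly 42.1%.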

length(unique(data$condition))
## [1] 7
unique(data$condition)
## [1] "good"      "excellent" NA          "like new"  "fair"      "salvage"  
## [7] "new"

This attribute gives only a quick indication of a car's condition while browsing the listings.

group_cond <- data %>%
  group_by(condition) %>%
  summarize(frequency = n())

ggplot(data = group_cond, aes(x = condition, y = frequency)) +
  geom_bar(stat = "identity", fill = 2, color = "#ffffff") +
  ylab("Frequency") +
  xlab("Condition")
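
By default the bars are ordered alphabetically, which mixes up the condition grades. Since ggplot2 draws a discrete axis in factor-level order, one option is to convert the column to an ordered factor first. The ranking in `condition_levels` below is our own assumption about how the grades compare, not something the dataset defines, and `toy_condition` is a hypothetical stand-in for `data$condition`.

```r
# Assumed ordering from worst to best (an illustration, not part of the dataset).
condition_levels <- c("salvage", "fair", "good", "excellent", "like new", "new")

# Hypothetical toy vector standing in for data$condition.
toy_condition <- c("good", "excellent", NA, "like new", "fair", "good")

# An ordered factor keeps the levels in the given order, also in tables and plots.
cond <- factor(toy_condition, levels = condition_levels, ordered = TRUE)

# useNA = "ifany" keeps the missing values visible as a separate count.
freq <- table(cond, useNA = "ifany")
print(freq)
```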

Number of cylinders (cylinders)

nrow(data[is.na(data$cylinders),])
## [1] 171140

More than a third of the records (171,140 of 458,213, about 37%) have no cylinder count filled in.

length(unique(data$cylinders))
## [1] 9
unique(data$cylinders)
## [1] "8 cylinders"  "4 cylinders"  "6 cylinders"  NA             "10 cylinders"
## [6] "other"        "5 cylinders"  "3 cylinders"  "12 cylinders"
group_cyl <- data %>%
  group_by(cylinders) %>%
  summarize(frequency = n())

ggplot(data = group_cyl, aes(x = cylinders, y = frequency)) +
  geom_bar(stat = "identity", fill = 2, color = "#ffffff") +
  ylab("Frequency") +
  xlab("Number of cylinders")
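
The cylinder counts are stored as text such as "8 cylinders", so they sort alphabetically ("10 cylinders" before "3 cylinders") and cannot be used as numbers directly. A minimal sketch of extracting the number follows; `toy_cyl` is a hypothetical stand-in for `data$cylinders`, and treating "other" as missing is our assumption.

```r
# Hypothetical toy vector standing in for data$cylinders.
toy_cyl <- c("8 cylinders", "4 cylinders", NA, "other", "12 cylinders")

# Keep only the leading digits; "other" has none, so as.integer() turns it into NA
# (the coercion warning is suppressed on purpose).
cyl_num <- suppressWarnings(
  as.integer(sub("^([0-9]+) cylinders$", "\\1", toy_cyl))
)
print(cyl_num)
```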

Fuel type (fuel)

nrow(data[is.na(data$fuel),])
## [1] 3237

3,237 listings have no fuel type filled in, so apparently those cars run for free. (✧ω✧)

length(unique(data$fuel))
## [1] 6
unique(data$fuel)
## [1] "gas"      "diesel"   "other"    "hybrid"   NA         "electric"
group_fuel <- data %>%
  group_by(fuel) %>%
  summarize(frequency = n())

ggplot(data = group_fuel, aes(x = fuel, y = frequency)) +
  geom_bar(stat = "identity", fill = 2, color = "#ffffff") +
  ylab("Frequency") +
  xlab("Fuel") +
  scale_y_continuous(breaks = seq(0, 500000, by = 20000))

The bar chart shows that most of the advertised cars run on gasoline.
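
To put a number on "most", the counts can be turned into percentage shares. This is a minimal sketch on a hypothetical toy vector (`toy_fuel` stands in for `data$fuel`; its values are made up), so the printed shares say nothing about the real dataset.

```r
# Hypothetical toy vector standing in for data$fuel.
toy_fuel <- c("gas", "gas", "diesel", NA, "gas", "electric", "gas", "hybrid")

# prop.table() converts counts to proportions; NA is kept as its own category.
fuel_share <- round(100 * prop.table(table(toy_fuel, useNA = "ifany")), 1)
print(fuel_share)
```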

Mileage (odometer)

nrow(data[is.na(data$odometer),])
## [1] 55303
boxplot(data$odometer, las=3)
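
The top-10 listing that follows contains readings of 10,000,000 and even 2,043,755,555, so the boxplot is dominated by a few implausible values. One possible way to flag them, sketched here under the assumption that the common 1.5 × IQR rule is acceptable for this attribute, is shown on a hypothetical toy vector (`toy_odometer` is not the real column).

```r
# Hypothetical toy vector standing in for data$odometer.
toy_odometer <- c(45000, 82000, 120000, 98000, 150000, 10000000, NA, 67000)

# Upper fence: third quartile plus 1.5 times the interquartile range.
q <- quantile(toy_odometer, c(0.25, 0.75), na.rm = TRUE)
upper_fence <- q[[2]] + 1.5 * IQR(toy_odometer, na.rm = TRUE)

# Flag readings above the fence; missing readings are never flagged.
is_outlier <- !is.na(toy_odometer) & toy_odometer > upper_fence

# Readings that remain after dropping missing and flagged values.
plausible <- toy_odometer[!is.na(toy_odometer) & !is_outlier]
print(upper_fence)
print(plausible)
```

A fixed domain threshold (for example a maximum believable reading) would be an equally valid choice; the fence above is only one heuristic.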

data[order(-data$odometer),][1:10,]
##                id
## 380797 7229317000
## 153223 7237854582
## 21880  7225071349
## 29826  7239691085
## 30283  7240189151
## 31403  7238955004
## 54968  7239456493
## 61130  7240290163
## 68619  7240659360
## 75189  7240449192
##                                                                                                    url
## 380797     https://elpaso.craigslist.org/ctd/d/el-paso-2005-gmc-sierra-2500hd-crew-cab/7229317000.html
## 153223         https://desmoines.craigslist.org/ctd/d/des-moines-2016-hyundai-veloster/7237854582.html
## 21880  https://littlerock.craigslist.org/cto/d/mount-ida-1980-jeep-cj-obo-need-to-sell/7225071349.html
## 29826                     https://imperial.craigslist.org/cto/d/brawley-1963-ss-impala/7239691085.html
## 30283   https://inlandempire.craigslist.org/cto/d/riverside-1970-oldsmobile-442-got-it/7240189151.html
## 31403        https://inlandempire.craigslist.org/cto/d/moreno-valley-2013-nissan-rogue/7238955004.html
## 54968           https://sandiego.craigslist.org/esd/cto/d/descanso-fire-truck-for-sale/7239456493.html
## 61130    https://sfbay.craigslist.org/sby/cto/d/watsonville-2003-dodge-stratus-only-82/7240290163.html
## 68619       https://cosprings.craigslist.org/cto/d/colorado-springs-1952-ford-victoria/7240659360.html
## 75189     https://pueblo.craigslist.org/cto/d/avondale-1987-jeep-grand-cherokee-laredo/7240449192.html
##                  region                          region_url price year
## 380797          el paso       https://elpaso.craigslist.org 16995 2005
## 153223       des moines    https://desmoines.craigslist.org 12995 2016
## 21880       little rock   https://littlerock.craigslist.org  4250 1980
## 29826   imperial county     https://imperial.craigslist.org  8900 1963
## 30283     inland empire https://inlandempire.craigslist.org  6000 1970
## 31403     inland empire https://inlandempire.craigslist.org  5800 2013
## 54968         san diego     https://sandiego.craigslist.org  4000 1977
## 61130       SF bay area        https://sfbay.craigslist.org  4350 2003
## 68619  colorado springs    https://cosprings.craigslist.org  4500 1952
## 75189            pueblo       https://pueblo.craigslist.org  1500 1987
##        manufacturer                    model condition   cylinders   fuel
## 380797          gmc            sierra 2500hd excellent 8 cylinders diesel
## 153223      hyundai                 veloster excellent 4 cylinders    gas
## 21880          jeep                      cj5      fair 6 cylinders    gas
## 29826     chevrolet                   impala      good       other    gas
## 30283          <NA> oldsmobile 442Oldsmobile      fair 8 cylinders    gas
## 31403        nissan                    rogue excellent 4 cylinders    gas
## 54968          <NA>        Hendrickson Truck      <NA>        <NA> diesel
## 61130         dodge                     <NA>      <NA>        <NA>    gas
## 68619          ford                 victoria      <NA>        <NA>    gas
## 75189          jeep           grand cherokee      good 6 cylinders    gas
##          odometer title_status transmission               VIN drive      size
## 380797 2043755555        clean    automatic 1GTHK23295F846900   4wd full-size
## 153223  123459789        clean    automatic KMHTC6AD5GU274883   fwd      <NA>
## 21880    10000000        clean       manual              <NA>   4wd      <NA>
## 29826    10000000      missing    automatic              <NA>   rwd full-size
## 30283    10000000        clean    automatic              <NA>   rwd      <NA>
## 31403    10000000        clean    automatic              <NA>   fwd  mid-size
## 54968    10000000        clean        other              <NA>  <NA>      <NA>
## 61130    10000000        clean    automatic              <NA>  <NA>      <NA>
## 68619    10000000        clean    automatic              <NA>  <NA>      <NA>
## 75189    10000000        clean    automatic              <NA>   4wd full-size
##          type paint_color
## 380797 pickup       white
## 153223  sedan        <NA>
## 21880    <NA>       black
## 29826   coupe        blue
## 30283   coupe        <NA>
## 31403     SUV       white
## 54968    <NA>        <NA>
## 61130    <NA>        <NA>
## 68619    <NA>        <NA>
## 75189     SUV       white
##                                                                 image_url
## 380797 https://images.craigslist.org/00w0w_lfHv2x91qs9_09G07g_600x450.jpg
## 153223 https://images.craigslist.org/00r0r_aRISNNafNmW_0ak07K_600x450.jpg
## 21880  https://images.craigslist.org/00D0D_4KJ1YF78MlF_0CI0t2_600x450.jpg
## 29826  https://images.craigslist.org/00a0a_a808owBmYbU_0CI0t2_600x450.jpg
## 30283  https://images.craigslist.org/00303_iipxxkO4Ch9_0CI0t2_600x450.jpg
## 31403  https://images.craigslist.org/00l0l_3o0WAhg5cC1_0CI0t2_600x450.jpg
## 54968  https://images.craigslist.org/00M0M_jWVD1HzcMMc_0CI0lM_600x450.jpg
## 61130  https://images.craigslist.org/01717_2itBUkVEBnf_0CI0t2_600x450.jpg
## 68619   https://images.craigslist.org/00q0q_O9i1exO5vk_0x20t2_600x450.jpg
## 75189  https://images.craigslist.org/00f0f_62UrK7Bd7wQ_0lM0t2_600x450.jpg
##        description
## 380797 Melendez Auto 
Sales Inc. 7725 Alameda Ave 7712 Alameda Ave, El Paso, TX 79915Or use the link belowto view more information!http://WWW.MELENDEZAUTOSALES.COM💬💬💬 HABLAMOS ESPAÑOL. 💬💬💬Para mas informacion, llamar al ☎ (915) 772-0020 / 91577840142005 GMC Sierra 2500HD Crew Cab 153  WB 4WD SLE Pickup 2,043,755,555 miles  /  /  /    Call (or Text)  (915) 778−4014 for quick answers to your questions about this GMC Sierra 2500HD Crew Cab 153  WB 4WD SLE.***** GMC Sierra 2500HD Crew Cab 153  WB 4WD SLE Pickup *****2006, 2007, 2008, 2005, 2004, 2003, 2002, GMC, Sierra 2500HD, Envoy, Envoy XL, Safari, Savana 1500, [Model5]Disclaimer :  Call or Text 915 727 4490Call (or text) ☏ (915) 778−4014Melendez Auto Sales Inc. 7725 Alameda Ave 7712 Alameda Ave, El Paso, TX 79915Or use the link belowto view more information!http://WWW.MELENDEZAUTOSALES.COM*GMC* *Envoy* *Envoy XL* *GMC* Safari* GMC* *Savana 1500* *Automatic* *Crew Cab 153  WB 4WD SLE* *GMC* *White* *Automatic* *Pickup* *6.6L 300.0hp* *4WD* *Melendez Auto Sales Inc.* *Call us today at (915) 778−4014* *GMC Sierra 2500HD Crew Cab 153  WB 4WD SLE Pickup 4WD 6.6L 300.0hp* *GMC* *Crew Cab 153  WB 4WD SLE* *GMC Sierra 2500HD Crew Cab 153  WB 4WD SLE Pickup 4WD 6.6L 300.0hp**GMC* *White* *Automatic* *Pickup* *6.6L 300.0hp* *4WD* *Call us today at (915) 778−4014* *GMC* *White* *Automatic* *Melendez Auto Sales Inc.* *Pickup* *6.6L 300.0hp* 2001 2000 1999 1998 1997 1996
## 153223 2016  HYUNDAI VELOSTER  Sedan   \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t  \t\t\t\t\t\t\tCall: (515) 262-9538 | Stock #: 74883 $12,995.00  \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t  \t\t\t\t\t\t\tTom's Auto Group \t\t\t\t\t\t\t\t2136 East University Ave. \t\t\t\t\t\t\t\tDes Moines, IA 50317 \t\t\t\t\t\t\t  \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t   \t\t\t\t\t\t\t\tOr use the link below to view more information! \t\t\t\t\t\t\t\thttps://tomsautogroup.com/used-2016-hyundai-veloster-v5364948.html   \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t  \t\t\t\t\t\t  \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t  \t\t\t\t\t\t\tYear : 2016 \t\t\t\t\t\t\tMake : HYUNDAI \t\t\t\t\t\t\tModel : VELOSTER \t\t\t\t\t\t\tTrim :  \t\t\t\t\t\t\tMileage : 123,459,789 \t\t\t\t\t\t\tTransmission : Automatic \t\t\t\t\t\t\tExterior Color : Pacific Blue \t\t\t\t\t\t\tInterior Color : Black \t\t\t\t\t\t\tEngine : 1.6L 4 Cylinder \t\t\t\t\t\t\tVIN : KMHTC6AD5GU274883 \t\t\t\t\t\t\tStock # : 74883 \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t  \t\t\t\t\t\t  \t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t  \t\t\t\t\t\t\tDescription of this HYUNDAI VELOSTER  Sedan \t\t\t\t\t\t\t2016 Hyundai Veloster Pacific Blue in color with Power Windows and Locks ,Tilt, Cruise, Paddle Shifter, Steering Wheel Audio Controls, Bluetooth, Cd player with AUX, Back up Camera, Fresh Detail, READY TO GO  \t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t  \t\t\t\t\t\t  \t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t  \t\t\t\t\t\t\tOptional equipment of this HYUNDAI VELOSTER  Sedan \t\t\t\t\t\t\t \t\t\t \t\t\t\t \t\t\t\t \t\t\t\t\tThis Vehicle Includes These Added Value Features! 
\t\t\t\t\t Back-Up Camera     Bluetooth Connection  \t\t\t\t \t\t\t \t\t\t\tVehicle Options: \t\t\t\t \t\t\t\t\tA/C    Adjustable Steering Wheel    Automatic Headlights    Back-Up Camera    Cargo Shade    Cruise Control    Driver Illuminated Vanity Mirror    Driver Vanity Mirror    Engine Immobilizer    Heated Mirrors    Intermittent Wipers    Keyless Entry    Passenger Illuminated Visor Mirror    Passenger Vanity Mirror    Power Door Locks    Power Mirror(s)    Power Steering    Power Windows    Rear Defrost    Security System    Steering Wheel Audio Controls    Trip Computer    Variable Speed Intermittent Wipers \t\t\t\t \t\t\t\t\tFront Wheel Drive    Gasoline Fuel    Transmission with Dual Shift Mode \t\t\t\t \t\t\t\t\tAM/FM Stereo    Auxiliary Audio Input    MP3 Player    Satellite Radio \t\t\t\t \t\t\t\t\t4-Wheel Disc Brakes    ABS    Brake Assist    Child Safety Locks    Daytime Running Lights    Driver Air Bag    Front Head Air Bag    Front Side Air Bag    Passenger Air Bag    Passenger Air Bag Sensor    Rear Head Air Bag    Stability Control    Traction Control \t\t\t\t \t\t\t\t\tBucket Seats    Cloth Seats    Pass-Through Rear Seat    Rear Bench Seat \t\t\t\t \t\t\t\t\tAluminum Wheels    Tire Pressure Monitor \t\t\t\t \t\t\t\t\tBluetooth Connection \t\t\t\t \t\t\t  \t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t  \t\t\t\t\t\t  \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\tCopy and Paste this link for more vehicle information: \t\t\t\t\thttps://tomsautogroup.com/used-2016-hyundai-veloster-v5364948.html \t\t\t\t\t \t\t\t\t\tCall: (515) 262-9538 to get the best price! \t\t\t\t\t \t\t\t\t\tTom's Auto Group \t\t\t\t\t2136 East University Ave. \t\t\t\t\tDes Moines, IA 50317 \t\t\t\t\t \t\t\t\t\t2016 HYUNDAI VELOSTER for sale in Des Moines
## 21880                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                              Looking to get rid of my 1980 CJ5 its a solid jeep ready to hunt would be good on a lease or a deer camp. It runs good, it is spray on bedlined inside and out. It is definitely not a perfect jeep inline 6cyl 4speed manual shifts out good 4wd drive works good has lock out hubs. Drivers seat is in good condition the passenger seat needs recovered or replaced. I have 2 other sets of seats and a back seat for it but it needs the rear seat mount I just kept it sitting on floor for the kids to sit on when riding dirt rds. It will need brake work eventually, and steeeing box leaks I've had it 2 yrs and drive it like it is but you have to pump the brakes to stop. I have replaced the old coil type ignition with a GM style HEI distributor and it has a newer carburetor It comes with a full soft top w/doors and a bikini top both are in very good condition. Tires are like new 235 75 15 XL. The 8000lb Ramsey winch works as it should also. Asking $4250 obo Again Its not a perfect jeep but it runs drives and stops fine for a hunting ride.  Texting or email is the best way to contact me. Phone calls are spotty at best and may not be returned due to poor cell service. I will deal in person cash only no paypal, no agent pick ups or whatever and I don't need help selling
## 29826                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  PATINA! 63 SS IMPALA  project  car. Straight Body, Og paint, No motor or Tranny ,No Bondo or primer , front end included, trunk , glass, No seats, Does have dent on passenger lower front 1/4  panel & rocker. needs floorpans, No Title, Bill of Sale only. $8900
## 30283                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      I got it running and it smokes the tires . Real deal 1970 442 , needs restored has 350 motor in it now but comes with a 455  and extra transmission. Has a aftermarket ram air hood and stock hood. Tilt wheel , AC , and disc brakes. Uncut dash. Dual exhaust. I believe it is Viking blue with white interior and had a white vinyl top. It has Soft -Ray Windows    $6000  OBO          show contact info                It started up after about 10 minutes of working on it , sounds good with the dual exhaust                                                                                                                                                           Cutlass SS Chevelle GTO GS 1968 1969 1971 1972
## 31403                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            2013 Nissan Rogue 2.5L Air condition power locks doors power windows  Alarm key less enter  Alloys wheels
## 54968                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Mid to Late 1970's Fire truck for Sale. the model of truck is a Hendrickson 1871-S. (If you google that model name, you will have a idea on what it looks like) Runs good with lights and siren and PA system still working. It has a 671 Detroit diesel engine with a 5 speed transmission.an approx. 50K miles. Tires in good condition as well. It has some fire hoses along with the water cannon has a hose with it too. purchased it a couple of years ago but have not got around to getting all the other equipment with it. So now I have to part with it. Oh, it does also have a ladder and all the pump valves have been overhauled too. If interested, contact me or text me at  show contact info  for additional answers to your questions. Asking $4000.00 O.B.O.
## 61130                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   Espanol / English  show contact info  No codes needed Post will be removed when gone No low ballers  2003 Stratus R/T 30 6 cyl. 3.0 motor only 82.000 miles Automatic Red on black Clean interior 4 CD changer Stereo with pleasent sound. Dent on fender Front new tires Clean Title  Just smoged Just regestered 4,300 priced for fast sale Only 82, k miles Trades welcome ( no junk ) Serious buyer only  show contact info   1998 1999 2000 2001 2002 2004 2005 2006 2007 1941 1940 1942 1939
## 68619                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        52 Victoria. Flathead V-8, automatic. Runs, drives, stops, but needs work to be road worthy. $4,500.00 or best offer. Trades considered. Call or text Only!! NO emails! seven19-338-411seven No help needed selling this car!
## 75189                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             Straight 6, 4x4 ,auto trans body in very good condition.
##        state      lat       long             posting_date
## 380797    tx 31.73213 -106.36604 2020-11-11T13:45:37-0700
## 153223    ia 41.60098  -93.58129 2020-11-27T14:36:14-0600
## 21880     ar 34.56120  -93.57490 2020-11-03T17:10:09-0600
## 29826     ca 32.97400 -115.53460 2020-11-30T20:16:29-0800
## 30283     ca 33.94550 -117.37570 2020-12-01T16:30:35-0800
## 31403     ca 33.92200 -117.24900 2020-11-29T14:21:00-0800
## 54968     ca 33.05340 -116.56580 2020-11-30T12:16:19-0800
## 61130     ca 36.91020 -121.75690 2020-12-01T21:37:26-0800
## 68619     co 38.77687 -104.78125 2020-12-02T14:31:03-0700
## 75189     co 38.10250 -104.52980 2020-12-02T09:23:07-0700
data_gt_mil_odo <- data[!is.na(data$odometer) & data$odometer > 1000000,]
nrow(data_gt_mil_odo)
## [1] 388

Roughly 400 records (388, to be exact) report an odometer reading above 1 million miles. These values are implausibly high, and we suspect they are made up.

boxplot(data$odometer[!is.na(data$odometer) & data$odometer < 1000000], las = 2)

After restricting the attribute to values under 1 million miles, the data look more realistic. Most values fall between roughly 45,000 and 150,000 miles, and the median is just under 100,000.
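The 1,000,000-mile cutoff is a fixed threshold. As a rough alternative, the sketch below flags implausible odometer readings with the common 1.5 × IQR rule. The vector `odo` is made-up toy data and `upper_fence` is a hypothetical helper name; neither is part of the original analysis.

```r
# Toy odometer readings (hypothetical values, not taken from the dataset)
odo <- c(45000, 60000, 82000, 95000, 120000, 150000, NA, 2043755555)

# Upper fence of the 1.5 * IQR rule; missing values are ignored
upper_fence <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  q[2] + k * (q[2] - q[1])
}

fence <- upper_fence(odo)

# Keep only known readings at or below the fence
odo_clean <- odo[!is.na(odo) & odo <= fence]

summary(odo_clean)
```

A data-driven fence adapts to the distribution, but on heavily skewed data it can also discard legitimate high-mileage vehicles, so a fixed domain threshold such as 1 million miles remains a reasonable choice.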

Transmission type (transmission)

nrow(data[is.na(data$transmission),])
## [1] 2442
length(unique(data$transmission))
## [1] 4
unique(data$transmission)
## [1] "other"     "automatic" "manual"    NA

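This is a categorical attribute describing the transmission type. It has three levels (other, automatic, manual); the fourth unique value is NA, which is missing for 2,442 records.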
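As a quick cross-check of a categorical attribute, base R's `table()` can list the counts with missing values as their own category. This is a minimal sketch on a made-up vector; the values below are hypothetical and not taken from the dataset.

```r
# Toy transmission column (hypothetical values, not taken from the dataset)
transmission <- c("automatic", "automatic", "manual", "other", NA, "automatic")

# Absolute counts, with missing values shown as their own category
counts <- table(transmission, useNA = "ifany")

# Relative shares, rounded to two decimals
shares <- round(prop.table(counts), 2)

counts
shares
```

Note that `length(unique(x))` counts NA as a distinct value, so it overstates the number of real categories by one whenever the column has missing entries.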

group_trans <- data %>%
  group_by(transmission) %>%
  summarize(frequency = n())

ggplot(data = group_trans, aes(x = transmission, y = frequency)) +
  geom_bar( stat = "identity", fill= 2, color="#ffffff") +
  ylab("Frequency") +
  xlab("Transmission") +
  scale_y_continuous(breaks = seq(0, 500000, by = 20000)) 

Manual transmissions are rare in the listings, which fits the running joke that Americans cannot drive a stick shift and that a manual gearbox is the best anti-theft system a car can have.

What is that stick?

VIN (vin)

nrow(data[is.na(data$VIN),])
## [1] 187572

A categorical attribute holding the vehicle identification number (VIN). It is missing for 187,572 records.

Drivetrain (drive)

nrow(data[is.na(data$drive),])
## [1] 134188
length(unique(data$drive))
## [1] 4
unique(data$drive)
## [1] "rwd" "fwd" NA    "4wd"
group_drv <- data %>%
  group_by(drive) %>%
  summarize(frequency = n())

ggplot(data = group_drv, aes(x = drive, y = frequency)) +
  geom_bar( stat = "identity", fill= 2, color="#ffffff") +
  ylab("Frequency") +
  xlab("Drivetrain") +
  scale_y_continuous(breaks = seq(0, 500000, by = 20000)) 

The most common drivetrain among the listed vehicles is four-wheel drive. Records with no value for this attribute also form a large group (134,188 records).

Size (size)

nrow(data[is.na(data$size),])
## [1] 321348
length(unique(data$size))
## [1] 5
unique(data$size)
## [1] NA            "full-size"   "mid-size"    "compact"     "sub-compact"
group_size <- data %>%
  group_by(size) %>%
  summarize(frequency = n())

ggplot(data = group_size, aes(x = size, y = frequency)) +
  geom_bar(stat = "identity", fill = 2, color = "#ffffff") +
  ylab("Frequency") +
  xlab("Size") +
  scale_y_continuous(breaks = seq(0, 500000, by = 20000))

Size expresses the size of the vehicle. It is probably an optional attribute on Craigslist, since most records do not have it.

Body type (type)

nrow(data[is.na(data$type),])
## [1] 112738
length(unique(data$type))
## [1] 14
unique(data$type)
##  [1] "other"       "sedan"       "SUV"         "pickup"      "coupe"      
##  [6] "van"         NA            "truck"       "mini-van"    "wagon"      
## [11] "convertible" "hatchback"   "bus"         "offroad"
group_type <- data %>%
  group_by(type) %>%
  summarize(frequency = n())

ggplot(data = group_type, aes(x = type, y = frequency)) +
  geom_bar(stat = "identity", fill = 2, color = "#ffffff") +
  ylab("Frequency") +
  xlab("Type") +
  scale_y_continuous(breaks = seq(0, 500000, by = 20000)) +
  theme(axis.text.x = element_text(angle = 90))

Pairwise analysis

Effect of fuel type on mileage

ggplot(data, aes(x = fuel, y = odometer)) + ylim(10,500000) + geom_boxplot()
## Warning: Removed 60218 rows containing non-finite values (stat_boxplot).

The boxplot shows that cars with diesel engines are sold with a higher mileage. What surprised us is, for example, the ratio between hybrid and gasoline (gas) engines. We suspect, however, that many records remain unclassified among those with an empty fuel attribute.
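
One detail worth keeping in mind when reading the boxplots in this section: ylim() sets scale limits, so rows outside the limits are discarded (together with missing values, which is what the "Removed ... rows" warnings report) before the boxplot statistics are computed. A minimal sketch on made-up data, assuming ggplot2 is installed, compares this with coord_cartesian(), which only zooms the view:

```r
library(ggplot2)

# Made-up toy data: four plausible odometer readings and one extreme value
toy <- data.frame(fuel = "gas", odometer = c(20, 100, 200, 300, 900000))

# ylim() drops the out-of-range row before the boxplot statistics are computed
p_ylim <- ggplot(toy, aes(x = fuel, y = odometer)) +
  geom_boxplot() +
  ylim(10, 500000)

# coord_cartesian() only zooms the view; all rows still enter the statistics
p_zoom <- ggplot(toy, aes(x = fuel, y = odometer)) +
  geom_boxplot() +
  coord_cartesian(ylim = c(10, 500000))

# layer_data() returns the values computed for the first layer of a plot
med_ylim <- suppressWarnings(layer_data(p_ylim)$middle)
med_zoom <- layer_data(p_zoom)$middle
```

The two medians differ because the extreme value takes part in the statistics only in the second plot. Which behaviour is preferable depends on whether the out-of-range values are considered errors or real observations.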

Effect of drive type on mileage

ggplot(data, aes(x = drive, y = odometer)) + ylim(10,500000) + geom_boxplot()
## Warning: Removed 60218 rows containing non-finite values (stat_boxplot).

Effect of transmission type on mileage

ggplot(data, aes(x = transmission, y = odometer)) + ylim(10,500000) + geom_boxplot()
## Warning: Removed 60218 rows containing non-finite values (stat_boxplot).

Effect of fuel type on vehicle price

ggplot(data, aes(x = fuel, y = price)) + ylim(10,300000) + geom_boxplot()
## Warning: Removed 35401 rows containing non-finite values (stat_boxplot).

Effect of drive type on vehicle price

ggplot(data, aes(x = drive, y = price)) + ylim(10,300000) + geom_boxplot()
## Warning: Removed 35401 rows containing non-finite values (stat_boxplot).

Effect of transmission type on vehicle price

ggplot(data, aes(x = transmission, y = price)) + ylim(10,300000) + geom_boxplot()
## Warning: Removed 35401 rows containing non-finite values (stat_boxplot).

Transmission - fuel - mileage

ggplot(data, aes(x=transmission, y=odometer, color=fuel)) + 
  ylim(10,500000) +
  geom_point(size=6) +
  theme_bw()
## Warning: Removed 60218 rows containing missing values (geom_point).

Transmission - price - drive - fuel

ggplot(data = data, aes(x = transmission, y = price, shape = drive, colour = fuel)) +
  ylim(10, 300000) +
  geom_jitter(size = 4) +
  xlab("Transmission")
## Warning: Removed 157887 rows containing missing values (geom_point).

Pairwise analysis price - mileage - year of manufacture

pairs(~price+odometer+year, data = data)

Summary of problems

We identified several problems in the dataset, including:

  1. Outliers
  2. Incorrectly filled fields
  3. Empty values

Outliers

This category covers the quantitative attributes odometer, price and year. The odometer has several outliers whose values appear to be either made up or the result of a data-entry error. The same holds for price: some models are listed at several times a plausible price, while many listings have a zero price. The difficulty is that these values are not easy to fill in or replace. The listings would have to be split by manufacturer, then by model, year, mileage and so on, because the price depends on all of these parameters. Such a step could itself introduce inaccuracies and skew the data. From our point of view it is therefore not worth investing significant effort in imputing these values, and we delete the affected records instead.
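
As an illustration of how such values could be flagged automatically, here is a minimal sketch of the common 1.5 x IQR rule. This is an alternative shown for comparison, not the method used in this report (which relies on fixed thresholds); the function name flag_outliers and the toy prices are assumptions made for the example:

```r
# Flag values lying more than k * IQR outside the quartiles
# (flag_outliers is a hypothetical helper, not part of the report's code)
flag_outliers <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  iqr <- q[[2]] - q[[1]]
  x < q[[1]] - k * iqr | x > q[[2]] + k * iqr
}

# Toy prices: five plausible values and one absurd one
price <- c(1, 5995, 12495, 22990, 30000, 3615215112)
flagged <- flag_outliers(price)
print(price[flagged])
```

Because used-car prices are strongly right-skewed, a rule like this would probably need to be applied per manufacturer and model, or on a log scale, to be useful in practice.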

Incorrectly filled fields

Several attributes, such as model, manufacturer and transmission, have empty values. Either the user filled in the listing incorrectly, or the crawler that downloaded the data had an implementation bug. Some attributes could be repaired, but that would require specific reference data that we could not find. The issue is that not every model is available in every configuration: some models, for example, may not offer a manual transmission in combination with front-wheel drive. The simplest solution is therefore to remove these records.

Empty values

We detected empty values in almost every attribute. For some attributes (model, manufacturer, fuel and so on) we have no way of filling them in, because we do not know the vehicle's configuration. For others (condition, body type, VIN) it does not even make sense to try, because the owner did not enter them and we cannot estimate the actual description of the vehicle. We will therefore treat a filled-in value as a plus that may be a reason for a higher price.

For most types of problems, the simplest option for us is to remove the affected records. For attributes that are empty because the owner did not enter them, we will treat whether they are filled in as a possible factor behind a higher price. Some empty or undefined values can be distributed among the remaining categories so that the distribution stays the same, for example for transmission type.
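
Distributing missing values among the existing categories while preserving their proportions could look roughly as follows. This is only a sketch on a made-up vector; the sampling approach and the fixed seed are assumptions for the example, not a description of code used later in the report:

```r
set.seed(42)  # fixed seed so the random assignment is reproducible

# Toy stand-in for data$transmission with three missing values
transmission <- c(rep("automatic", 8), rep("manual", 2), NA, NA, NA)

# Shares of the observed categories
observed <- transmission[!is.na(transmission)]
shares <- prop.table(table(observed))

# Draw a replacement for every NA with probability equal to the observed share
missing <- is.na(transmission)
transmission[missing] <- sample(names(shares), size = sum(missing),
                                replace = TRUE, prob = as.numeric(shares))
print(table(transmission))
```

With many missing values the resulting shares stay close to the observed ones, although the individual assignments are random and carry no information about the actual vehicle.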

Defining hypotheses

  1. The price of a vehicle is higher when the listing includes a VIN.
  2. The number of cylinders in cars decreases as the year of manufacture increases (likely due to engine downsizing and efforts to reduce the environmental burden).
  3. Diesel cars with all-wheel drive usually have a higher mileage in their listings.
  4. Split the listings by coordinates into three parts (eastern, central and western USA), observe the distribution of vehicles in use, and check whether pickups are more common in the central part than other models.

We will follow a hypothesis-driven approach: we will try to verify the hypotheses stated above and evaluate them in the conclusion.
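
To make the first hypothesis concrete, one possible way to test it is a one-sided Wilcoxon rank-sum test on prices grouped by whether the VIN is present. The choice of test, the data frame listings and the column has_vin are assumptions for this sketch, and the prices are made up:

```r
# Made-up listings: four without a VIN and four with one
listings <- data.frame(
  price = c(4000, 5200, 6100, 7300, 9800, 12500, 15100, 18900),
  VIN   = c(NA, NA, NA, NA, "VIN_A", "VIN_B", "VIN_C", "VIN_D"),
  stringsAsFactors = FALSE
)

# Indicator: does the listing contain a VIN?
listings$has_vin <- !is.na(listings$VIN)

# Median price in each group (names are "FALSE" and "TRUE")
medians <- tapply(listings$price, listings$has_vin, median)

# H1: prices without a VIN (first group, FALSE) are lower than prices with one
test <- wilcox.test(price ~ has_vin, data = listings, alternative = "less")
print(medians)
print(test$p.value)
```

A rank-based test is used here because, as the price section shows, prices are far from normally distributed.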

Data cleaning

Removing duplicates

nrow(data[duplicated(data[,3:19]),])
## [1] 55473
data[c(60,80),]
##            id
## 60 7229265094
## 80 7226011186
##                                                                                       url
## 60 https://auburn.craigslist.org/ctd/d/sacramento-2014-subaru-impreza-20i/7229265094.html
## 80 https://auburn.craigslist.org/ctd/d/sacramento-2014-subaru-impreza-20i/7226011186.html
##    region                    region_url price year manufacturer   model
## 60 auburn https://auburn.craigslist.org 12998 2014       subaru impreza
## 80 auburn https://auburn.craigslist.org 12998 2014       subaru impreza
##    condition cylinders fuel odometer title_status transmission
## 60 excellent      <NA>  gas    99598        clean    automatic
## 80 excellent      <NA>  gas    99598        clean    automatic
##                  VIN drive size  type paint_color
## 60 JF1GPAU63E8325853   4wd <NA> wagon      silver
## 80 JF1GPAU63E8325853   4wd <NA> wagon      silver
##                                                             image_url
## 60 https://images.craigslist.org/00F0F_j7yIeuD1XCp_0cU09G_600x450.jpg
## 80 https://images.craigslist.org/00Y0Y_3nTaoVwOMG4_0cU09G_600x450.jpg
##    description
## 60                                                                                                                                                                                                                                                                                                                                                       2014 *** Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon ***    Drive it home today. Call (Or Text) us now !!Call (or text) ☏ (916) 778-3115  916 Auto Sales 4020 Marysville Blvd., Sacramento, CA 95838Copy & Paste the URL belowto view more information!http://916autosales.v12soft.com/cars/13432713    \t\t\tYear : 2014\t\t\t\tMake : Subaru\t\t\t\tModel : Impreza\t\t\t\tTrim : 2.0i Sport Limited AWD 4dr Wagon\t\t\t\t   Mileage : 99,598 miles\t\t\t\tTransmission : Automatic\t\t\t\tExterior Color : Silver\t\t\t\tInterior Color : Black\t\t\t\tSeries : 2.0i Sport Limited AWD 4dr Wagon Wagon\t\t\t\tDrivetrain : 4WD\t\t\t\tCondition : Excellent\t\t\t\tVIN : JF1GPAU63E8325853\t\t\t\tStock ID : 8325853\t\t\t\tEngine : 2.0L H4\t   \tDescription of this Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon \t \tMeet our versatile 2014 Subaru Impreza 2.0i Sport Limited 5 Door Hatchback shown in Ice Silver Metallic. Powered by a proven 2.0 Liter BOXER 4 Cylinder that provides 148hp while connected to an innovative CVT. This All Wheel Drive achieve incredible fuel economy of near 36mpg plus Subaru's lower center of gravity provides better handling and serves up that that sports car feel. Load up your friends and head out for some highway fun while looking sharp with 17-inch aluminum-alloy wheels, fog-lights, and raised roof rails. Inside our Limited, see plenty of room for your family or fishing buddies plus gear and the family dog. 
Slide behind the wheel and take in the ergonomic design filled with amenities to provide for a comfortable road trip that start with heated leather seating, Automatic climate control, and an amazing audio system. The exterior lines scream sports car while the interior provides ample space for cargo or passengers. Legendary Subaru reliability and exemplary safety features are abundant to protect the ones you love. No longer does an All Wheel Drive translate to the family truckster. Subaru offers instant power to the wheels, which will have you loving your Impreza at first drive. Print this page and call us Now... We Know You Will Enjoy Your Test Drive Towards Ownership! Guaranteed Approval ~ WITH ONLY 1500.00 Down and Drive Off Today pls call 916-888-6888 or 916-6041234 ....WE HAVE FINANCING AVAILABLE.. ~ NO CREDIT ~ NO PROBLEM ~ YOUR JOB IS YOUR CREDIT~ ~ EXTENDED WARRANTY AVAILABLE ~ ~ WE ACCEPT MOST INCOMES TYPES INCLUDING SSI & DISABILITY~ We over 60 cars and trucks to choose from. We have the perfect car for you We finance ~ YOUR JOB IS YOUR CREDIT~ Guaranteed Approval ~ WITH ONLY 1500.00 Down and Drive Off Today  CLEAN TITLE  CLEAN CARFAX  SMOG IN HAND WE WORK WITH VARIOUS BANKS TO GET YOU APPROVED REGARDLESS OF YOUR CREDIT.  
BAD CREDIT NO PROBLEM  NO CREDIT NO PROBLEM FIRST TIME BUYERNO PROBLEM NO LICENCENO PROBLEM WE WORK WITH MULTIPLE LENDERS TO COVER ALL TYPES OF CREDIT SHORT TIME AT THE JOBNO PROBLEM WARRANTY 3 MONTH OR 3000 MILES From the Third-party, IF QUALIFY Some Restriction may apply LOW to NO Down Payment (On Approved Credits) First Time Buyers Special Programs (Low Interest Rate, Low Monthly Payment, Low to No Down Payment) Hispanic Buyers Financing Programs (No Driver's License, ITIN numbers, Low to No Down Payment) Bad Credit/No Credit Financing Programs USAA Navy Federal Seawest Coast Guard Credit Union For more info ,PLS VISIT US @ WWW.916AUTOSALE.COM OR CALL US @ .....916 826-4043 , 916-604-1234 or 916-888-6888 4020 marysvile blvd sac ca 95838 . DISCLAIMER : Prices are subject to change without notice. Internet special prices might not reflect actual sale prices. Please contact our dealership for details. Price also does not include finance charges, finance fees, lender fees, taxes, gov. fees and other sale related charges.        Call (or text)  (916) 778-3115 for quick answers to your questions about this Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon.   100% APPROVAL with OACWe Guarantee It! No matter your financial situation, you will drive off the lot in a new car, today.✅ Bad Credit✅ No Credit ✅ Repossession✅ SSI✅ Disability✅ Government AssistanceYOU'RE APPROVED!🚗 🚕 🚙  916 Auto Sales   🚗 🚕 🚙☎ CALL OR TEXT (916) 778-3115🔴  BAD CREDIT, GOOD CREDIT WE HAVE A VARIETY OF OPTIONS FOR YOU!!!🔵 IN-HOUSE FINANCING. 
🔴 WITH OVER TWO-DOZEN LENDERS AVAILABLE, WE CAN PROVIDE A FINANCING SOLUTION TO MOST ANY CREDIT HISTORY.🔵 WARRANTY AVAILABLE🚘 TRADE/SELL/BUY ✅ GAP INSURANCE AVAILABLE ✅ FIRST TIME BUYER, CREDIT PROGRAM↪ CHECK OUT OUR INVENTORY AThttp://916autosales.v12soft.com/cars/13432713  ***** Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon *****  2015, 2016, 2017, 2014, 2013, 2012, 2011, Subaru Impreza, Forester, Impreza, Legacy, Outback, Tribeca, Impreza Outback Sport, Impreza WRX STi, BRZ, XV Crosstrek, Impreza WRX, XV Crosstrek Hybrid, WRX, WRX STI   Disclaimer : * Please Note All Inventory And Inventory Pricing Is Subject To Change Please See Dealer For Further Details *     Drive it home today. Call (Or Text) us now !!Call (or text) ☏ (916) 778-3115  916 Auto Sales 4020 Marysville Blvd., Sacramento, CA 95838Copy & Paste the URL belowto view more information!http://916autosales.v12soft.com/cars/13432713   2014 14 *Subaru* *Impreza* *Cheap 2.0i Sport Limited AWD 4dr Wagon* \t\t*Like New 2014 2.0i Sport Limited AWD 4dr Wagon Wagon* *2.0L H4* \t\t*Must See 2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline - \t\t2014 Subaru Impreza  impreza IMPREZA 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon Cheap -  \t\t2014 Subaru Impreza (2.0i Sport Limited AWD 4dr Wagon) Carfax Gasoline 2.0L H4 -  \t\t2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon 2.0L H4 Gasoline  -  \t\tSubaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon   \t\t*SCHEDULE YOUR TEST DRIVE 2014 Subaru Impreza  2.0L H4 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon*   \t\t*Subaru* *Impreza* 2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon   \t\t*2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon  \t\t*916 Auto Sales* *Call (or text) us today at (916) 778-3115.* \t\t2015 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon 2.0L H4 - \t\tHave you seen this 2016 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon ?  
\t\tMust See 2017 Subaru Impreza  2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon  \t\t*For Sale Impreza* *Impreza* *Carfax 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon  \t\tCome test drive this amazing *Subaru* *Impreza* *(2.0I SPORT LIMITED AWD 4DR WAGON)* *Gasoline* Wagon 2.0i Sport Limited AWD 4dr Wagon Wagon Gasoline Wagon Gasoline* \t\t*(Subaru)* *(Impreza)* *2.0i Sport Limited AWD 4dr Wagon* *2.0L H4* *(GASOLINE)* *Bad Credit* \t\t*Gasoline* *Wagon*  *Super Vehicle Gasoline Call (or text) this number (916) 778-3115* *2.0L H4* *916 Auto Sales* * Good Credit* \t\t2014 2013 2012 2011  \t\t*This vehicle is a used Subaru Impreza* *No Credit* \t\t*It is like New 2.0i Sport Limited AWD 4dr Wagon* *2.0L H4 Gasoline*  \t\t*Gasoline* 2010 2009 2008 2007 2006 2005
## 80 2014 *** Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon ***    Ready To Upgrade Your Ride Today? We Make It Fast & Easy!Call (or text) ☏ (916) 619-1849  Motor Sports Sac 4020 Marysville Blvd, Sacramento, CA 95838Copy & Paste the URL belowto view more information!http://motorsportsac.v12soft.com/cars/13442935    \t\t\tYear : 2014\t\t\t\tMake : Subaru\t\t\t\tModel : Impreza\t\t\t\tTrim : 2.0i Sport Limited AWD 4dr Wagon\t\t\t\t   Mileage : 99,598 miles\t\t\t\tTransmission : Automatic\t\t\t\tExterior Color : Silver\t\t\t\tInterior Color : Black\t\t\t\tSeries : 2.0i Sport Limited AWD 4dr Wagon Wagon\t\t\t\tDrivetrain : 4WD\t\t\t\tCondition : Excellent\t\t\t\tVIN : JF1GPAU63E8325853\t\t\t\tStock ID : 8325853\t\t\t\tEngine : 2.0L H4\t   \tDescription of this Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon \t \tMeet our versatile 2014 Subaru Impreza 2.0i Sport Limited 5 Door Hatchback shown in Ice Silver Metallic. Powered by a proven 2.0 Liter BOXER 4 Cylinder that provides 148hp while connected to an innovative CVT. This All Wheel Drive achieve incredible fuel economy of near 36mpg plus Subaru's lower center of gravity provides better handling and serves up that that sports car feel. Load up your friends and head out for some highway fun while looking sharp with 17-inch aluminum-alloy wheels, fog-lights, and raised roof rails. Inside our Limited, see plenty of room for your family or fishing buddies plus gear and the family dog. Slide behind the wheel and take in the ergonomic design filled with amenities to provide for a comfortable road trip that start with heated leather seating, Automatic climate control, and an amazing audio system. The exterior lines scream sports car while the interior provides ample space for cargo or passengers. Legendary Subaru reliability and exemplary safety features are abundant to protect the ones you love. No longer does an All Wheel Drive translate to the family truckster. 
Subaru offers instant power to the wheels, which will have you loving your Impreza at first drive. Print this page and call us Now... We Know You Will Enjoy Your Test Drive Towards Ownership! Guaranteed Approval ~ WITH ONLY 1500.00 Down and Drive Off Today pls call 916-888-6888 or 916-6041234 ....WE HAVE FINANCING AVAILABLE.. ~ NO CREDIT ~ NO PROBLEM ~ YOUR JOB IS YOUR CREDIT~ ~ EXTENDED WARRANTY AVAILABLE ~ ~ WE ACCEPT MOST INCOMES TYPES INCLUDING SSI & DISABILITY~ We over 60 cars and trucks to choose from. We have the perfect car for you We finance ~ YOUR JOB IS YOUR CREDIT~ Guaranteed Approval ~ WITH ONLY 1500.00 Down and Drive Off Today  CLEAN TITLE  CLEAN CARFAX  SMOG IN HAND WE WORK WITH VARIOUS BANKS TO GET YOU APPROVED REGARDLESS OF YOUR CREDIT.  BAD CREDIT NO PROBLEM  NO CREDIT NO PROBLEM FIRST TIME BUYERNO PROBLEM NO LICENCENO PROBLEM WE WORK WITH MULTIPLE LENDERS TO COVER ALL TYPES OF CREDIT SHORT TIME AT THE JOBNO PROBLEM WARRANTY 3 MONTH OR 3000 MILES From the Third-party, IF QUALIFY Some Restriction may apply LOW to NO Down Payment (On Approved Credits) First Time Buyers Special Programs (Low Interest Rate, Low Monthly Payment, Low to No Down Payment) Hispanic Buyers Financing Programs (No Driver's License, ITIN numbers, Low to No Down Payment) Bad Credit/No Credit Financing Programs USAA Navy Federal Seawest Coast Guard Credit Union For more info ,PLS VISIT US @ WWW.motorsportsac.com OR CALL US @ .....916 544 3125 916-604-1234 or 916-888-6888 4020 marysvile blvd sac ca 95838 . DISCLAIMER : Prices are subject to change without notice. Internet special prices might not reflect actual sale prices. Please contact our dealership for details. Price also does not include finance charges, finance fees, lender fees, taxes, gov. fees and other sale related charges.        Call (or text)  (916) 619-1849 for quick answers to your questions about this Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon.   ⭐ Great Bank Financing Options Available ⭐✅ Bad Credit? 
✅ No Credit? ✅ First Time Buyer? ✅ We Work With Dozens Of Lenders To Get You Approved Fast Regardless Of Your Credit Situation. 🚘 Ready To Get Behind The Wheel Of This Great Car 🚘 👉 Go to :100% APPROVAL with OACWe Guarantee It! No matter your financial situation, you will drive off the lot in a new car, today.✅ Bad Credit✅ No Credit ✅ Repossession✅ Bankruptcy✅ Foreclosure✅ SSI✅ Disability✅ Government AssistanceYOU'RE APPROVED!🚘 Motor Sports Sac 🚘 ✅ Huge Selection Of Quality Pre-Owned Cars ✅ Leave The Lot With Confidence Ask About Our Competitive Extended Warranties ✅ Trade-In Your Car Today For A Great Discount  ✅ We Buy Cars Cash  📍 Stop By Today And See Why Our Dealership Is Always The People's Choice💥 Check Out More Of Our Great Cars On Craigslist Just Copy And Paste This Link Into Your Browser 💥 https://auburn.craigslist.org/search/ctd?query=motorsportsac.v12soft.com  ***** Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon *****  2015, 2016, 2017, 2014, 2013, 2012, 2011, Subaru Impreza, Forester, Impreza, Legacy, Outback, Tribeca, Impreza Outback Sport, Impreza WRX STi, BRZ, XV Crosstrek, Impreza WRX, XV Crosstrek Hybrid, WRX, WRX STI   Disclaimer : * Please Note All Inventory And Inventory Pricing Is Subject To Change Please See Dealer For Further Details *     Ready To Upgrade Your Ride Today? 
We Make It Fast & Easy!Call (or text) ☏ (916) 619-1849  Motor Sports Sac 4020 Marysville Blvd, Sacramento, CA 95838Copy & Paste the URL belowto view more information!http://motorsportsac.v12soft.com/cars/13442935   2014 14 *Subaru* *Impreza* *Cheap 2.0i Sport Limited AWD 4dr Wagon* \t\t*Like New 2014 2.0i Sport Limited AWD 4dr Wagon Wagon* *2.0L H4* \t\t*Must See 2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline - \t\t2014 Subaru Impreza  impreza IMPREZA 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon Cheap -  \t\t2014 Subaru Impreza (2.0i Sport Limited AWD 4dr Wagon) Carfax Gasoline 2.0L H4 -  \t\t2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon 2.0L H4 Gasoline  -  \t\tSubaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon   \t\t*SCHEDULE YOUR TEST DRIVE 2014 Subaru Impreza  2.0L H4 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon*   \t\t*Subaru* *Impreza* 2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon   \t\t*2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon  \t\t*Motor Sports Sac* *Call (or text) us today at (916) 619-1849.* \t\t2015 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon 2.0L H4 - \t\tHave you seen this 2016 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon ?  
\t\tMust See 2017 Subaru Impreza  2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon  \t\t*For Sale Impreza* *Impreza* *Carfax 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon  \t\tCome test drive this amazing *Subaru* *Impreza* *(2.0I SPORT LIMITED AWD 4DR WAGON)* *Gasoline* Wagon 2.0i Sport Limited AWD 4dr Wagon Wagon Gasoline Wagon Gasoline* \t\t*(Subaru)* *(Impreza)* *2.0i Sport Limited AWD 4dr Wagon* *2.0L H4* *(GASOLINE)* *Bad Credit* \t\t*Gasoline* *Wagon*  *Super Vehicle Gasoline Call (or text) this number (916) 619-1849* *2.0L H4* *Motor Sports Sac* * Good Credit* \t\t2014 2013 2012 2011  \t\t*This vehicle is a used Subaru Impreza* *No Credit* \t\t*It is like New 2.0i Sport Limited AWD 4dr Wagon* *2.0L H4 Gasoline*  \t\t*Gasoline* 2010 2009 2008 2007 2006 2005
##    state     lat      long             posting_date
## 60    al 38.6411 -121.4286 2020-11-11T13:31:41-0600
## 80    al 38.6411 -121.4286 2020-11-05T13:48:15-0600

Even when comparing a fairly large subset of attributes (columns 3 to 19), we found a large number of duplicates. These duplicates differ in attributes such as url and id, so we assume the seller uploaded them to Craigslist more than once, and we drop them.

data <- data[!duplicated(data[,3:19]),]
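
Selecting the comparison columns by position (3:19) depends on the column order staying fixed. A minimal sketch of the same idea with columns selected by name, on a made-up data frame (listings and key_cols are names introduced for this example):

```r
# Made-up listings: rows 1 and 2 differ only in id and url
listings <- data.frame(
  id     = c(1, 2, 3),
  url    = c("a.html", "b.html", "c.html"),
  region = c("auburn", "auburn", "birmingham"),
  price  = c(12998, 12998, 4500),
  VIN    = c("JF1GPAU63E8325853", "JF1GPAU63E8325853", NA),
  stringsAsFactors = FALSE
)

# Compare on every column except those expected to differ between re-posts
key_cols <- setdiff(names(listings), c("id", "url"))

# duplicated() marks the second and later occurrences; keep the first of each
deduped <- listings[!duplicated(listings[, key_cols]), ]
print(deduped$id)
```

The first occurrence of each group is kept, so which copy of a re-posted listing survives depends on the row order.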

Processing individual attributes

price

typeof(data$price)
## [1] "double"
summary(data$price)
##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's 
##          1       5995      12495      47355      22990 3615215112      28066
data <- data[!is.na(data$price),]

Besides the NA values, we decided to look at the very high values we saw in the exploratory analysis. We chose an upper and a lower threshold and removed the values outside them. The upper threshold marks the point above which car prices seem unrealistic and made up to us.

top_threshold = 1111111
bottom_threshold = 500

data <- data[!data$price >= top_threshold,]
data <- data[!data$price <= bottom_threshold,]
boxplot(data$price, las=2)

ggplot(data = data, aes(sample=price)) +
  stat_qq() +
  stat_qq_line() +
  scale_y_continuous(breaks = seq(0, 1000000, by = 50000))

percentil <- function (x) {
    quantiles <- quantile( x, c(.05, .95 ))
    x[ x < quantiles[1] ] <- quantiles[1]
    x[ x > quantiles[2] ] <- quantiles[2]
    return(x)
}
data$price <- percentil(data$price)
boxplot(data$price, las=2)

ggplot(data = data, aes(sample=price)) +
  stat_qq() +
  stat_qq_line() +
  scale_y_continuous(breaks = seq(0, 5000000, by = 50000))

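The flattened start and end of the curve show the effect of clamping at the 5th and 95th percentiles. The outliers have largely been dealt with (values beyond the two percentiles are replaced by the percentile value, not deleted), so the distribution looks better, but the data still does not come from a normal distribution.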
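
The clamping step (often called winsorizing) can be illustrated on a small made-up vector. The sketch below is an equivalent formulation with pmin() and pmax(); the function name winsorize and the toy prices are assumptions for the example:

```r
# Clamp values to the given lower and upper quantiles
# (winsorize is a hypothetical name for the same idea as percentil above)
winsorize <- function(x, probs = c(0.05, 0.95)) {
  q <- quantile(x, probs, na.rm = TRUE)
  pmin(pmax(x, q[[1]]), q[[2]])
}

# Toy prices with one very low and one very high value
price <- c(600, 5995, 12495, 22990, 900000)
clamped <- winsorize(price)
print(clamped)
```

The number of records is unchanged; only the two extreme values move to the percentile boundaries, which is why the ends of the Q-Q plot appear flat.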

year

typeof(data$year)
## [1] "double"
data$year <- as.integer(data$year)
summary(data$year)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    1900    2007    2012    2010    2016    2021     955

We try to fill in the missing years from the description attribute:

data[is.na(data$year),]$year <- apply(data[is.na(data$year),], MARGIN=1, FUN=function(row) {
  year <- str_extract(row['description'], "([1-2][8,9,0]\\d{2})")
  return(as.integer(year))
})
nrow(data[is.na(data$year),])
## [1] 18

The fill-in was largely successful: only 18 records remain without a year, and we drop them.

data <- data[!is.na(data$year),]
ggplot(data = data, aes(x=year)) +
  geom_histogram(bins = 121, fill= 6, color="#ffffff")
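
A note on the pattern used for the year extraction: inside a character class a comma is a literal character, so [8,9,0] also matches ",", and without word boundaries the pattern can match part of a longer number. A stricter variant might look like the following sketch in base R (extract_year is a hypothetical helper, and limiting years to 1900-2099 is an assumption):

```r
# Extract the first standalone four-digit year between 1900 and 2099
extract_year <- function(text) {
  hit <- regexpr("\\b(19|20)[0-9]{2}\\b", text, perl = TRUE)
  year <- rep(NA_integer_, length(text))
  found <- which(hit > 0)
  # regmatches() returns only the matched substrings, in input order
  year[found] <- as.integer(regmatches(text, hit))
  year
}

descriptions <- c("Only 1,000 down! 2014 Subaru Impreza", "Runs great, call me")
years <- extract_year(descriptions)
print(years)
```

With the original pattern the first description would match "1,00" before reaching the year, which cannot be converted to an integer.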

manufacturer

typeof(data$manufacturer)
## [1] "character"
nrow(data[is.na(data$manufacturer),])
## [1] 14104

In theory this attribute could again be filled in with a regex over description if we had an additional dataset, for example a list of all vehicle manufacturers, but we decided to remove the rows with missing values instead.

data <- data[!is.na(data$manufacturer),]
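
For completeness, the lookup-based fill-in described above could be sketched as follows. The list known_makers is a short hypothetical stand-in for the external manufacturer dataset that we do not have, and find_maker is a name introduced for this example:

```r
# Hypothetical, deliberately short list standing in for a real manufacturer dataset
known_makers <- c("subaru", "ford", "chevrolet", "mazda")

# Return the first maker from the list that occurs as a whole word in the text
# (order of the list decides ties, not the position in the description)
find_maker <- function(description, makers) {
  text <- tolower(description)
  for (m in makers) {
    if (grepl(paste0("\\b", m, "\\b"), text, perl = TRUE)) {
      return(m)
    }
  }
  NA_character_
}

maker <- find_maker("2014 Subaru Impreza 2.0i Sport Limited", known_makers)
print(maker)
```

Dealer descriptions often list several makes for search purposes, so a real implementation would need a rule for choosing among multiple hits.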

model

As mentioned in the analysis, a suitable way to fill in missing model values is a positive lookbehind on the description attribute. In short, we take the first word that follows the manufacturer in the listing description. So if the manufacturer is “Mazda”, the model is NA, and the description contains “Mazda 6”, we put “6” into model.
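The idea can be tried on a single string first. The sketch below uses base R with `perl = TRUE` instead of stringr so that it runs without extra packages; the description text and variable names are made up for illustration.

```r
# Hypothetical listing text; the manufacturer is assumed to be known.
description  <- "For sale: 2014 Mazda 6 Touring, clean title"
manufacturer <- "mazda"

# (?<=mazda\s) requires "mazda " right before the match without including it;
# \w+ then captures the next word. (?i) makes the match case-insensitive.
pattern <- sprintf("(?i)(?<=%s\\s)\\w+", manufacturer)
m <- regexpr(pattern, description, perl = TRUE)
model <- if (m[1] == -1) NA_character_ else regmatches(description, m)
model
```

A description that never mentions the manufacturer gives no match and the model stays NA, which is why this approach cannot fill every row.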

nrow(data[is.na(data$model),])
## [1] 3774
data[is.na(data$model),]$model <- apply(data[is.na(data$model),], MARGIN=1, FUN=function(row) {
  model <- str_extract(row['description'], regex(sprintf("(?<=%s\\s)\\w+", row['manufacturer']),ignore_case=TRUE))
  return(model)
})
nrow(data[is.na(data$model),])
## [1] 1654
data[is.na(data$model),][1:5,]
##              id
## 206  7239785162
## 646  7236608785
## 1502 7230263884
## 1866 7227495669
## 2308 7234774355
##                                                                                            url
## 206        https://bham.craigslist.org/cto/d/gainesville-2017-ram-4x4-for-sale/7239785162.html
## 646     https://bham.craigslist.org/cto/d/trussville-1955-chevy-belair-hardtop/7236608785.html
## 1502   https://bham.craigslist.org/cto/d/springville-1937-ford-coupe-streetrod/7230263884.html
## 1866         https://bham.craigslist.org/cto/d/trussville-1956-chevy-210-trade/7227495669.html
## 2308 https://dothan.craigslist.org/cto/d/dothan-2005-gmc-ukon-denali-all-wheel/7234774355.html
##          region                    region_url price year manufacturer model
## 206  birmingham   https://bham.craigslist.org 18000 2017          ram  <NA>
## 646  birmingham   https://bham.craigslist.org 26500 1955    chevrolet  <NA>
## 1502 birmingham   https://bham.craigslist.org 40900 1937         ford  <NA>
## 1866 birmingham   https://bham.craigslist.org 18000 1956    chevrolet  <NA>
## 2308     dothan https://dothan.craigslist.org  4500 2005          gmc  <NA>
##      condition   cylinders fuel odometer title_status transmission  VIN drive
## 206       <NA>        <NA>  gas    95000        clean    automatic <NA>  <NA>
## 646       <NA>        <NA>  gas       NA        clean    automatic <NA>  <NA>
## 1502 excellent 8 cylinders  gas    13000        clean    automatic <NA>   rwd
## 1866      <NA>        <NA>  gas       NA        clean    automatic <NA>  <NA>
## 2308      fair        <NA>  gas   158000        clean    automatic <NA>  <NA>
##           size  type paint_color
## 206       <NA>  <NA>        <NA>
## 646       <NA>  <NA>        <NA>
## 1502 full-size coupe        blue
## 1866      <NA>  <NA>        <NA>
## 2308      <NA>  <NA>         red
##                                                               image_url
## 206  https://images.craigslist.org/00n0n_cZ7RUc9IPIl_0t20CI_600x450.jpg
## 646  https://images.craigslist.org/00t0t_5M1MnIqbIVS_0CI0t2_600x450.jpg
## 1502 https://images.craigslist.org/00202_9Zl2QmcObLX_0CI0t2_600x450.jpg
## 1866 https://images.craigslist.org/00A0A_2DwIHe0Xiuv_0CI0t2_600x450.jpg
## 2308 https://images.craigslist.org/00q0q_iGsYoGLAtG9_0lM0t2_600x450.jpg
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           description
## 206                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Great truck everything works ding on back finder 4x4 works great hwy miles $18,000 cash 💸 cell #  show contact info
## 646  For Sale or Trade ??(BUT TRADE IS HIGHER )I have a 1955 Belair Hardtop,The 55 still needs a little work but the hard stuff is done my price will go up the more work I do still needs door panels and head liner Might have them made in the next few days, Had a 4 speed in it the pedal still in it and the 55 has power disk brakes has a mild built 350/350 transmission, Paint is new and the bumpers front and rear are new, Still needs a few pieces of chrome but could have them in a couple of days,The Original Seats have been recoverd and look good new Rims and Tires . Thanks Doug I Like OLD 30 sand 40s Coupes and Muscle cars
## 1502                                                                                                                                                                                                                                                                                                                                Minotti glass, 350 Corvette engine,700R, Mustang II, 8 inch Ford, Coddington wheels,Walker, Lokar, Dakota Digital,Vintage Air, P/B, P/S, cruise, tilt, Carrera, Pdrs, P/T, grey leather, Sony Sound, Boston Acoustics, ghost flames, loaded. NSRA Safety 23 inspected. Call  show contact info . No Texts Please.
## 1866           (TRADE IS HIGHER ) I have a 1956 210 with a 350/350 motor and transmission shifts good lights all work has good brakes , runs and drives good,the 56 has been under a carport for a few years just sitting but still runs and drives good. Car has a little rust in the pans but has been patch so you can drive it while you work on it, Im still working on it every day so price will change I'm asking 18k look around and see what a Tri 5s is going for, I like to trade but be real with your offers. AND NOTHING OVER 1972 I LIKE OLD 30s COUPES AND OLD HOT RODS ,CAR SOLD WITH BILL OF SALE THANKS DOUG. 205-five 08- six112
## 2308                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  2005 UKON ALL WHEEL DR.  257000 MI. GOOD. MECHANICAL COND. FOR MILAGE! USE NO OIL, NO SMOKE, FAIR BFG AT. TIRES
##      state      lat      long             posting_date
## 206     al 32.82100 -88.15890 2020-12-01T07:50:35-0600
## 646     al 33.63390 -86.59810 2020-11-24T18:54:33-0600
## 1502    al 33.76742 -86.46558 2020-11-13T10:23:53-0600
## 1866    al 33.63390 -86.59810 2020-11-08T11:39:44-0600
## 2308    al 31.14810 -85.37180 2020-11-21T11:51:54-0600

We consider the replacement successful: this way we filled in more than half of the missing values. The approach is not 100% reliable, because the model name does not always follow the manufacturer name. In theory we could handle this by building a histogram of model frequencies, choosing a threshold, and setting the models we consider outliers back to NA. For example, if a model appears only once or twice across all listings, it is very likely an error.
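A rough sketch of such a frequency threshold, on made-up model names. The threshold `min_count` and the sample values are assumptions and would need tuning on the real data.

```r
# Hypothetical extracted model names; "is" and "for" stand in for bad matches.
model <- c("f-150", "f-150", "f-150", "civic", "civic", "civic", "is", "for", NA)

min_count <- 3  # assumed threshold

counts <- table(model)                       # frequency of each model (NA excluded)
rare   <- names(counts)[counts < min_count]  # models seen fewer than min_count times
model[model %in% rare] <- NA                 # treat rare models as missing again

table(model, useNA = "ifany")
```

The trade-off is that genuinely rare models are discarded together with the extraction errors.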

condition

typeof(data$condition)
## [1] "character"
nrow(data[is.na(data$condition),])
## [1] 132279
unique(data$condition)
## [1] "good"      "excellent" NA          "like new"  "fair"      "salvage"  
## [7] "new"

Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.

data <- data[!is.na(data$condition),]
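For reference, one-hot encoding in base R could look roughly like the sketch below. The sample values are made up, and `model.matrix` is only one of several ways to do this.

```r
# Hypothetical values of the condition attribute.
condition <- factor(c("good", "excellent", "fair", "good"))

# "- 1" removes the intercept so every level gets its own 0/1 column.
one_hot <- model.matrix(~ condition - 1)
colnames(one_hot) <- levels(condition)
one_hot
```

Each row contains exactly one 1, in the column of its category.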

cylinders

typeof(data$cylinders)
## [1] "character"
nrow(data[is.na(data$cylinders),])
## [1] 44763
unique(data$cylinders)
## [1] "8 cylinders"  "4 cylinders"  "6 cylinders"  NA             "10 cylinders"
## [6] "5 cylinders"  "3 cylinders"  "other"        "12 cylinders"

With this attribute we do the following: so that we can work with it later, we convert it to a numeric attribute and mark the unknown value other as NA.

data[which(data$cylinders == 'other'),]$cylinders <- NA
data[!is.na(data$cylinders),]$cylinders <- apply(data[!is.na(data$cylinders),], MARGIN=1, FUN=function(x) str_extract(x['cylinders'],'\\d+'))

We then convert it to integer:

data$cylinders <- as.integer(data$cylinders)
unique(data$cylinders)
## [1]  8  4  6 NA 10  5  3 12

We drop the rows:

data <- data[!is.na(data$condition),]

fuel

typeof(data$fuel)
## [1] "character"
nrow(data[is.na(data$fuel),])
## [1] 9
unique(data$fuel)
## [1] "gas"      "diesel"   "other"    "hybrid"   "electric" NA

Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.

data <- data[!is.na(data$fuel),]

odometer

The attribute contains outlying values of distance driven (in miles).

typeof(data$odometer)
## [1] "double"
summary(data$odometer)
##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's 
##          0      39600      89000     108566     138124 2043755555      16227
boxplot(data$odometer, las=3)

ggplot(data = data, aes(sample=odometer)) +
  stat_qq() + 
  stat_qq_line() +
  scale_y_continuous(breaks = seq(0, 1000000, by = 250000)) 
## Warning: Removed 16227 rows containing non-finite values (stat_qq).
## Warning: Removed 16227 rows containing non-finite values (stat_qq_line).

We decided to drop the values above a chosen threshold.

top_threshold <- 1000000
data <- data[!data$odometer > top_threshold, ]
data <- data[!is.na(data$odometer),]
data$odometer = percentil(data$odometer)
boxplot(data$odometer, las=2)

ggplot(data = data, aes(sample=odometer)) +
  stat_qq() + 
  stat_qq_line() +
  scale_y_continuous(breaks = seq(0, 5000000, by = 50000)) 

transmission

typeof(data$transmission)
## [1] "character"
nrow(data[is.na(data$transmission),])
## [1] 52
unique(data$transmission)
## [1] "other"     "automatic" "manual"    NA

Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.

data <- data[!is.na(data$transmission),]

vin

We will not remove the VIN. For this attribute we want to find out whether a listing has a higher price when the VIN is present.

We do, however, add an attribute that we will need later: a boolean indicating whether the VIN is given.

data$VIN_defined <- !is.na(data$VIN)

drive

typeof(data$drive)
## [1] "character"
nrow(data[is.na(data$drive),])
## [1] 37021
unique(data$drive)
## [1] "rwd" "fwd" NA    "4wd"

Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.

data <- data[!is.na(data$drive),]

size

typeof(data$size)
## [1] "character"
nrow(data[is.na(data$size),])
## [1] 84329
unique(data$size)
## [1] NA            "full-size"   "mid-size"    "compact"     "sub-compact"

Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.

data <- data[!is.na(data$size),]

type

typeof(data$type)
## [1] "character"
nrow(data[is.na(data$type),])
## [1] 1912
unique(data$type)
##  [1] "pickup"      "SUV"         "sedan"       "truck"       "van"        
##  [6] "convertible" "hatchback"   "coupe"       "mini-van"    NA           
## [11] "wagon"       "other"       "offroad"     "bus"

Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.

data <- data[!is.na(data$type),]

long

typeof(data$long)
## [1] "double"
summary(data$long)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
## -160.88 -104.70  -86.79  -92.28  -79.93  150.90     570
data <- data[!is.na(data$long),]

Data preparation for the 4th hypothesis: based on longitude we split the listings into the east, middle and west of the USA. We picked the longitude boundaries approximately from this map:

map of the USA

data$location <- ifelse(as.double(data$long) > -85, 'east', 
  ifelse(as.double(data$long) < -110, 'west', 'mid')
)
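To see how the two boundaries behave, the same rule can be applied to a few sample longitudes. The values below are rough longitudes of Seattle, Denver, Chicago and New York, chosen only for illustration.

```r
# Approximate longitudes of Seattle, Denver, Chicago and New York (illustrative).
long <- c(-122.3, -105.0, -87.6, -74.0)

# Same rule as above: east of -85 is "east", west of -110 is "west", the rest is "mid".
location <- ifelse(long > -85, "east",
  ifelse(long < -110, "west", "mid")
)
data.frame(long, location)
```

With these boundaries a city such as Chicago falls into the middle region.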

Summary of processing

As mentioned earlier, we cannot impute the existing data, because the attributes are too strongly interrelated to fill in individual records. Imputation could produce combinations of values that do not exist in reality. It is therefore more useful for us to remove these records than to knowingly introduce a high bias into the data. These attributes are important for our hypotheses about vehicle price, and they could not be replaced in any way (mean, sampling that preserves the distribution, and so on) without harming the authenticity of the data, so we decided to remove the affected records.

We solved most of the problems we had by deleting the affected records or by clipping values to percentiles.

Hypothesis testing

The price of a vehicle is higher when the VIN is given in the listing.

kruskal.test(x = data$price, g = as.factor(data$VIN_defined))
## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$price and as.factor(data$VIN_defined)
## Kruskal-Wallis chi-squared = 6230.4, df = 1, p-value < 2.2e-16
ggplot(data, aes(x=VIN_defined,y=price)) + geom_boxplot()

The boxplot shows that the median price is indeed higher when the listing includes the VIN. The Kruskal-Wallis test also gave a p-value below 0.05, so we reject the null hypothesis (the price is the same whether or not the VIN is given) and can state that giving the VIN in a listing is associated with the vehicle price.
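The Kruskal-Wallis test only says that the two groups differ, while the hypothesis is directional. One way to test the direction, sketched here on synthetic data (the simulated prices and variable names are assumptions), is a one-sided Wilcoxon rank-sum test.

```r
set.seed(1)
# Synthetic prices: listings with a VIN are simulated to be more expensive.
price_vin    <- rlnorm(200, meanlog = 9.8, sdlog = 0.5)
price_no_vin <- rlnorm(200, meanlog = 9.3, sdlog = 0.5)

# One-sided Wilcoxon rank-sum test: H1 is that prices with a VIN tend to be higher.
res <- wilcox.test(price_vin, price_no_vin, alternative = "greater")
res$p.value
```

On the real data the call would be `wilcox.test(price ~ VIN_defined, data = data, alternative = "less")`, where the direction depends on the order of the factor levels (FALSE comes first).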

The number of cylinders in cars decreases as the year increases (this may be due to engine downsizing and efforts to reduce the environmental burden).

chisq.test(data$year, data$cylinder, correct=FALSE)
## Warning in chisq.test(data$year, data$cylinder, correct = FALSE): Chi-squared
## approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  data$year and data$cylinder
## X-squared = 7307.4, df = 558, p-value < 2.2e-16
ggplot(data = data, aes(x = year, y = cylinders)) +
   stat_summary(geom = "line", fun = mean)
## Warning: Removed 1903 rows containing non-finite values (stat_summary).

To compare two categorical (although numeric) attributes, year and number of cylinders, we chose the chi-squared test. The p-value is very small (< 2.2e-16), so we again reject the null hypothesis. The plot also shows that, even though the number of vehicles grows with the year, the number of cylinders in engines does decrease as the year increases.
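The chi-squared test checks whether the two attributes are independent, but not in which direction they move together. A possible complement, shown here on synthetic data (all values are made up), is the Spearman rank correlation, whose sign indicates the direction of a monotonic trend.

```r
set.seed(2)
# Synthetic data: newer cars are simulated to have fewer cylinders on average.
year      <- sample(1990:2020, 500, replace = TRUE)
p_eight   <- (2020 - year) / 60 + 0.1          # share of 8-cylinder cars falls over time
cylinders <- ifelse(runif(500) < p_eight, 8, 4)

# Spearman rank correlation tests for a monotonic trend; exact = FALSE because of ties.
res <- cor.test(year, cylinders, method = "spearman", exact = FALSE)
res$estimate   # a negative rho means cylinders decrease with year
res$p.value
```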

Diesel cars with four-wheel drive usually have a higher number of miles driven in listings.

data$awd_diesel <- data$drive == '4wd' & data$fuel == 'diesel'
kruskal.test(x = data$odometer, g = as.factor(data$awd_diesel))
## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$odometer and as.factor(data$awd_diesel)
## Kruskal-Wallis chi-squared = 209.39, df = 1, p-value < 2.2e-16
ggplot(data, aes(x=awd_diesel,y=odometer)) + geom_boxplot()

Both the boxplot and the p-value of the Kruskal-Wallis test indicate that cars with four-wheel drive and diesel fuel tend to have more miles driven.

Split the listings into 3 areas by coordinates (east, middle and west of the USA), observe the distribution of vehicle types, and check whether pickups are more common in the middle part than other types.

ggplot(data, aes(x = type, group=location, fill=location)) +
  geom_bar(position = "dodge" )

The plot shows that the eastern USA has the largest number of sedans. In the middle part of the USA, SUVs are the most common.
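Since the three regions contain different numbers of listings, raw counts are hard to compare across regions. A small sketch of comparing shares within each region instead, using made-up listings:

```r
# Hypothetical listings; the counts are made up for illustration.
listings <- data.frame(
  location = c(rep("east", 6), rep("mid", 4)),
  type     = c("sedan", "sedan", "sedan", "SUV", "SUV", "pickup",
               "SUV", "SUV", "pickup", "sedan")
)

counts <- table(listings$location, listings$type)
shares <- prop.table(counts, margin = 1)  # each row (region) sums to 1
round(shares, 2)
```

In this toy example both regions have one pickup, but its share is higher in the smaller mid region.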

Bayesian statistics

Is the event that a listed car has an automatic transmission independent of whether the listed car is a pickup?

bn

Probability that a car has an automatic transmission:

print(nrow(data[data$transmission == "automatic", ])/nrow(data))
## [1] 0.9235323
PA = nrow(data[data$transmission == "automatic", ])/nrow(data)

Probability that a car is a pickup:

print(nrow(data[data$type == "pickup", ])/nrow(data))
## [1] 0.07651808
PP = nrow(data[data$type == "pickup", ])/nrow(data)

Probabilities of the states P(M=1|P,A)

MID = 1 PICKUP = 1 AUTOMAT = 1

data_pick = data[data$type == "pickup", ]
data_pick_automat = data_pick[data_pick$transmission == "automatic", ]
data_pick_automat_mid =data_pick_automat[data_pick_automat$location == "mid", ]
Pm1p1a1 = nrow(data_pick_automat_mid) / nrow(data_pick_automat)
print(nrow(data_pick_automat_mid) / nrow(data_pick_automat))
## [1] 0.3754455

MID = 1 PICKUP = 1 AUTOMAT = 0

data_pick = data[data$type == "pickup", ]
data_pick_automat = data_pick[!data_pick$transmission == "automatic", ]
data_pick_automat_mid =data_pick_automat[data_pick_automat$location == "mid", ]
Pm1p1a0 = nrow(data_pick_automat_mid) / nrow(data_pick_automat)
print(nrow(data_pick_automat_mid) / nrow(data_pick_automat))
## [1] 0.2807775

MID = 1 PICKUP = 0 AUTOMAT = 1

data_pick = data[!data$type == "pickup", ]
data_pick_automat = data_pick[data_pick$transmission == "automatic", ]
data_pick_automat_mid =data_pick_automat[data_pick_automat$location == "mid", ]
Pm1p0a1 = nrow(data_pick_automat_mid) / nrow(data_pick_automat)
print(nrow(data_pick_automat_mid) / nrow(data_pick_automat))
## [1] 0.3558854

MID = 1 PICKUP = 0 AUTOMAT = 0

data_pick = data[!data$type == "pickup", ]
data_pick_automat = data_pick[!data_pick$transmission == "automatic", ]
data_pick_automat_mid =data_pick_automat[data_pick_automat$location == "mid", ]
Pm1p0a0 = nrow(data_pick_automat_mid) / nrow(data_pick_automat)
print(nrow(data_pick_automat_mid) / nrow(data_pick_automat))
## [1] 0.303495
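The four blocks above repeat the same filtering with different conditions. As a sketch, the estimate could be wrapped in one helper; the function name `p_mid_given` and the toy data are assumptions used only for illustration.

```r
# Hypothetical listings; all values are made up for illustration.
toy <- data.frame(
  type         = c("pickup", "pickup", "pickup", "sedan", "sedan", "SUV", "SUV", "SUV"),
  transmission = c("automatic", "automatic", "manual", "automatic", "manual",
                   "automatic", "automatic", "manual"),
  location     = c("mid", "east", "mid", "mid", "west", "east", "mid", "west")
)

# Estimate P(M = 1 | P = pickup, A = automatic) as the share of "mid" rows
# among the rows matching the given pickup/automatic combination.
p_mid_given <- function(df, pickup, automatic) {
  rows <- (df$type == "pickup") == pickup &
          (df$transmission == "automatic") == automatic
  mean(df$location[rows] == "mid")
}

p_mid_given(toy, pickup = TRUE,  automatic = TRUE)
p_mid_given(toy, pickup = TRUE,  automatic = FALSE)
p_mid_given(toy, pickup = FALSE, automatic = TRUE)
p_mid_given(toy, pickup = FALSE, automatic = FALSE)
```

As in the blocks above, every transmission other than "automatic" counts as AUTOMAT = 0.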

We want to verify the independence of P and A. The relation \(P(P,A \mid M) = P(A \mid M) \cdot P(P \mid M)\) must hold if the events are to be independent; the left-hand side cannot be computed directly.

We compute the probability over all states:

\(P(M=1) = \sum_{P,A} P(M=1 \mid P,A) \cdot P(P) \cdot P(A)\)

Pm1ap = Pm1p0a0*(1-PP)*(1-PA) + Pm1p0a1*(1-PP)*(PA) + Pm1p1a0*(PP)*(1-PA) + Pm1p1a1*(PP)*(PA)
Pm1ap
## [1] 0.3531285

We compute the probability of each state given that the listing comes from the middle part of the USA.

\(P(P,A \mid M=1) = \frac{P(M=1 \mid P,A) \cdot P(P) \cdot P(A)}{P(M=1)}\)

\(P(P=1,A=1 \mid M=1)\)

Pp1a1m1 = Pm1p1a1*(PP)*(PA) / Pm1ap
Pp1a1m1
## [1] 0.07513291

\(P(P=0,A=1 \mid M=1)\)

Pp0a1m1 = Pm1p0a1*(1-PP)*(PA) / Pm1ap
Pp0a1m1
## [1] 0.8595236

\(P(P=1,A=0 \mid M=1)\)

Pp1a0m1 = Pm1p1a0*(PP)*(1-PA) / Pm1ap
Pp1a0m1
## [1] 0.004652342

\(P(P=0,A=0 \mid M=1)\)

Pp0a0m1 = Pm1p0a0*(1-PP)*(1-PA) / Pm1ap
Pp0a0m1
## [1] 0.06069112

Now we look at the second part of the relation, \(P(A \mid M) \cdot P(P \mid M)\):

1. \(P(P=1 \mid M=1) = \sum_{A} P(P=1,A \mid M=1)\)

Pp1m1 = Pp1a1m1 +  Pp1a0m1
Pp1m1
## [1] 0.07978525

2. \(P(A=1 \mid M=1) = \sum_{P} P(P,A=1 \mid M=1)\)

Pa1m1 = Pp1a1m1 + Pp0a1m1
Pa1m1
## [1] 0.9346565

If the events P and A were independent, the following would have to hold for all states: \(P(P=1,A=1 \mid M=1) = P(P=1 \mid M=1) \cdot P(A=1 \mid M=1)\)

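# compare with a numerical tolerance instead of exact floating-point equality
if (isTRUE(all.equal(Pp1a1m1, Pp1m1 * Pa1m1)))
{
  print('Nezávislé')  # independent
} else {
  print('Závislé')    # dependent
}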
## [1] "Závislé"
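With probabilities estimated from data, the two sides will practically never be exactly equal, so the comparison alone does not show whether the difference is larger than sampling noise. One possible complement, sketched here on synthetic data (all names and numbers are assumptions), is a chi-squared test of independence on the 2x2 table of pickup versus automatic within the mid region.

```r
set.seed(3)
n <- 2000
# Synthetic listings from the "mid" region: pickup and automatic are generated
# independently here, so any difference between the two sides is sampling noise.
pickup    <- runif(n) < 0.08
automatic <- runif(n) < 0.92

tab <- table(pickup, automatic)
res <- chisq.test(tab)

# Joint probability versus product of the marginals, as in the check above.
joint   <- mean(pickup & automatic)
product <- mean(pickup) * mean(automatic)
c(joint = joint, product = product, p_value = res$p.value)
```

Even for independently generated data the joint probability and the product differ slightly, which is why a test with a p-value is more informative than a direct comparison.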

BN